eserde – a serde that just doesn't stop

54 points by weinzierl 4 days ago

    The API consumer is then forced into a slow and frustrating feedback loop:

    1. Send request
    2. Receive a single error back
    3. Fix the error
    4. Back to 1., until there are no more errors to be fixed

    That's a poor developer experience. We should do better!

Like, "write actually useful documentation so the developer doesn't have to experiment with the remote API, like a blind man groping the elephant"?.. oh. Right, that's a human/organization problem; therefore unsolvable.

terhechte a day ago
Not everybody has this luxury. I once worked on a project where the BE dev had a fundamental issue understanding types (or his PHP code did, who knows), such that, for example, a json array would change its type when empty, e.g.
```
  // Non-Empty
  {"objects": ["a", "b"]}

  // Empty
  {"objects": {}}
```
and many other such issues.
- aeldidi a day ago
  
  No kidding. I work with an API at my job that does the following:
  1. Instead of omitting optional values or at least setting them to null, optional values are set to an empty string when not present (This is including string fields where an empty string would be valid).
  2. Lists are represented in such a way that an empty string means an empty list, a list with one element is simply the element, and only lists with more than one element actually use JSON arrays. For example, to represent a list of points ({ x: number, y: number }):
      // Empty array
      "points": ""
      // Single element
      "points": { x: 7, y: 28 }
      // 2 or more elements
      "points": [{ x: 7, y: 28 }, { x: 12, y: 4 }]
  3. Everything is stored in a string. The number 5? “5”. A Boolean? “true” or “false”.
  What’s funny is I was able to use serde to create wrapper types for each of these so these atrocities were abstracted away. For example, those arrays I described are simply `Vec<T>` in the actual types with an annotation using `#[serde_as(as = "CursedArray<T>")]`[0]. Likewise something similar for the string-encoded types as well.
  [0]: https://docs.rs/serde_with/3.12.0/serde_with/guide/serde_as/...
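
For anyone who hasn't had the pleasure, here is a rough std-only sketch of the normalization described above. It is not the commenter's `CursedArray<T>` and not serde_with's API; `WirePoints`, `normalize_points` and `parse_optional_number` are made-up names, and the enum stands in for whatever the JSON layer hands you. With serde, the three-way split would more typically be an untagged enum or a serde_with adapter.

```rust
/// A point as the API describes it.
#[derive(Debug, Clone, PartialEq)]
pub struct Point {
    pub x: i64,
    pub y: i64,
}

/// The three shapes the "points" field can take on the wire.
/// Hypothetical type: it stands in for the parsed JSON value.
#[derive(Debug, Clone, PartialEq)]
pub enum WirePoints {
    /// `"points": ""`
    EmptyString,
    /// `"points": { x: 7, y: 28 }`
    Single(Point),
    /// `"points": [{ x: 7, y: 28 }, { x: 12, y: 4 }]`
    Many(Vec<Point>),
}

/// Collapse the three wire shapes into the single `Vec` the rest of the code wants.
pub fn normalize_points(wire: WirePoints) -> Vec<Point> {
    match wire {
        WirePoints::EmptyString => Vec::new(),
        WirePoints::Single(p) => vec![p],
        WirePoints::Many(ps) => ps,
    }
}

/// String-encoded optional number: "" means absent, "5" means 5.
pub fn parse_optional_number(raw: &str) -> Result<Option<i64>, std::num::ParseIntError> {
    if raw.is_empty() {
        Ok(None)
    } else {
        raw.parse::<i64>().map(Some)
    }
}
```

The point is that the mess is handled once at the boundary, and everything downstream only ever sees `Vec<Point>` and `Option<i64>`.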
  
  terhechte a day ago
  
  Man, I feel you. I did the same, the issue is that I had to regularly update my wrapper types as new exceptions occurred
- lesuorac a day ago
  
  IIRC, that's because arrays and objects (Associative Arrays) are the same thing in PHP so an empty array can be serialized differently than a filled array because it doesn't know if it's associative or not.
- andai a day ago
  
  Adding this to my PHP Horrors museum.
leoedin a day ago

You can have the best documentation in the world, but if it’s documenting a complex json object with many different combinations of fields, and the user has made a couple of typos, that doesn’t help.
An API which provides more complete failure feedback is a good thing. No need for snark.
kelnos a day ago

This is a bit of a silly take. Even with fantastic API documentation, I've sometimes -- ok, more than sometimes -- made mistakes when crafting my API requests.
- genewitch a day ago
  
  Hand craft the json in bash and then call the API from bash.
  We do what we have to to make the lights blink.

kelnos 2 days ago

I started reading this a bit skeptically, but my main concern was addressed: this doesn't reimplement JSON deserialization; eserde's deserializer first uses serde_json's deserializer, and if it succeeds, it returns the result. So you don't have to trust someone else's deserializer that might not be maintained as well as serde_json is.

The downside to this approach is that if serde_json's deserializer encounters an error, they then have to deserialize the input again with their own deserializer (that wraps serde_json's) in order to get all the errors out. Fortunately the happy path is just as fast as using serde_json directly, but the unhappy path will be at least twice as slow as serde_json when it encounters an error. But I think this could be an acceptable trade off for some people.
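
To make the two-pass idea concrete, here is a toy sketch in plain std Rust. It is not eserde's code or API; `parse_fail_fast`, `collect_all_errors` and `parse` are hypothetical names, and a two-field "user" stands in for a real JSON payload. The fast parser runs first, and the slower, error-collecting pass only runs when the fast one fails.

```rust
#[derive(Debug, PartialEq)]
pub struct User {
    pub name: String,
    pub age: u8,
}

/// Fast path: stops at the first problem (stand-in for the primary deserializer).
pub fn parse_fail_fast(name: &str, age: &str) -> Result<User, String> {
    if name.is_empty() {
        return Err("name: must not be empty".to_string());
    }
    let age = age.parse::<u8>().map_err(|e| format!("age: {}", e))?;
    Ok(User { name: name.to_string(), age })
}

/// Slow path: visits every field and records every problem it finds.
pub fn collect_all_errors(name: &str, age: &str) -> Vec<String> {
    let mut errors = Vec::new();
    if name.is_empty() {
        errors.push("name: must not be empty".to_string());
    }
    if let Err(e) = age.parse::<u8>() {
        errors.push(format!("age: {}", e));
    }
    errors
}

/// Two-pass strategy: the input is only walked a second time on failure.
pub fn parse(name: &str, age: &str) -> Result<User, Vec<String>> {
    match parse_fail_fast(name, age) {
        Ok(user) => Ok(user),
        Err(first) => {
            let mut errors = collect_all_errors(name, age);
            // Defensive: if the two passes ever disagree, keep the original error.
            if errors.is_empty() {
                errors.push(first);
            }
            Err(errors)
        }
    }
}
```

Valid input pays for one pass; invalid input pays for two, which is where the "at least twice as slow" on the unhappy path comes from.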

What would be really nice, though, would be to change serde/serde_json/serde_${WHATEVER} so that it can do this (optionally, since even with built-in support, the failure case will still be slower and take up more resources). I expect it could be done in such a way to maintain API/ABI compat with the current state of things.

jrimbault a day ago

I wonder if the feature could be added to serde itself and gated behind a feature flag.
- kelnos a day ago
  
  I'm sure it could be (possibly breaking the current serde API), but I think I'd prefer it be a runtime option. That way you can toggle it on and off for different uses of different APIs.
  For example, my service might deserialize in two different places: 1) when deserializing request bodies sent from clients, and 2) when deserializing response bodies when making API calls to other services.
  I might want to enable this feature for #1 (so I can return complete error messages to clients), but disable it for #2 (because I expect this issue to be rare, and don't want to incur a performance penalty when handling deserialization errors if remote API servers give me something I don't expect).
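
Roughly what I mean by a runtime option, as a std-only sketch. Nothing here is serde's API; `ErrorMode` and `parse_ports` are names I made up, and a list of string-encoded ports stands in for a request body.

```rust
/// Chosen per call site at runtime, not per build.
#[derive(Debug, Clone, Copy, PartialEq)]
pub enum ErrorMode {
    /// Stop at the first error (cheap; fine for responses from other services).
    FirstOnly,
    /// Keep going and report everything (nicer for errors returned to clients).
    CollectAll,
}

/// Parse string-encoded ports, reporting one or all errors depending on `mode`.
pub fn parse_ports(raw: &[&str], mode: ErrorMode) -> Result<Vec<u16>, Vec<String>> {
    let mut ports = Vec::new();
    let mut errors = Vec::new();
    for (i, item) in raw.iter().enumerate() {
        match item.parse::<u16>() {
            Ok(p) => ports.push(p),
            Err(e) => {
                errors.push(format!("item {}: {}", i, e));
                if mode == ErrorMode::FirstOnly {
                    break;
                }
            }
        }
    }
    if errors.is_empty() {
        Ok(ports)
    } else {
        Err(errors)
    }
}
```

The request handler would pass `ErrorMode::CollectAll`, and the outbound API client would pass `ErrorMode::FirstOnly`.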
terhechte a day ago

There was a discussion about this on the Rust Reddit. Can't find the link right now as I'm on a very slow internet connection.
- ibotty a day ago
  
  https://old.reddit.com/r/rust/comments/1ish5vd/eserde_dont_s...
cchance a day ago

this is what i was thinking, why isn't this just a feature on serde itself

epage a day ago

I was really excited by this but unfortunately this has major blockers in the use cases I care about (e.g. inside of `cargo`).

Generally if you care about error recovery, you also care about error formatting. Libraries like `serde_json` and `toml` provide the information necessary for you to provide custom formatting for the error messages so they can look more like rustc's output but eserde throws this away by calling `to_string` on the error messages and storing that.
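
A small std-only sketch of why the structured form matters. `ParseError` and `render` are hypothetical, not types from `serde_json`, `toml` or eserde (though `serde_json::Error` does expose `line()` and `column()`). Once the error has been flattened with `to_string`, the position can only be recovered by parsing the message back apart.

```rust
use std::fmt;

/// An error that keeps its position as data instead of baking it into a string.
#[derive(Debug, Clone, PartialEq)]
pub struct ParseError {
    pub line: usize,   // 1-based
    pub column: usize, // 1-based
    pub message: String,
}

impl fmt::Display for ParseError {
    fn fmt(&self, f: &mut fmt::Formatter) -> fmt::Result {
        write!(f, "{} at line {} column {}", self.message, self.line, self.column)
    }
}

/// Render a rustc-like snippet with a caret under the offending column.
/// This needs `line` and `column` as fields; a pre-formatted string is not enough.
pub fn render(source: &str, err: &ParseError) -> String {
    let text = source.lines().nth(err.line.saturating_sub(1)).unwrap_or("");
    let pad = " ".repeat(err.column.saturating_sub(1));
    format!(
        "error: {}\n --> line {}, column {}\n  | {}\n  | {}^",
        err.message, err.line, err.column, text, pad
    )
}
```

A library that stores only `err.to_string()` leaves the caller with the `Display` output and nothing to build `render` from.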

Deebster 2 days ago

That’s quite useful in the example given (passing errors back to clients), but I wonder if sometimes these other errors are artifacts of the first error. It would be more annoying to have these false positives (and waste time working out that that’s what they are) than to have to retry.

the_gipsy a day ago

For a moment I was holding my breath - thinking this would do something stupid like setting the values to `Default`, or even trying to guess defaults, whatever.

But no! This is actually really cool, to get all the errors at once!

Timwi a day ago

You came so close. You realized that stopping at the first error is bad and you should continue. But then you still discard the entire deserialization when there's even a single error. You only needed one more step to realize that it might as well return the partial result alongside the errors.

The most common error in deserialization is a missing field. The most common reason for it being missing is that the field was added later; an older version of the API didn't have it. Having the deserializer just bail out means the API cannot introduce a new field with a sensible default value and instead has to create a whole separate API endpoint that does the same thing.
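
A sketch of the shape I have in mind, in plain std Rust. This is not something eserde or serde offers; `PartialSettings` and `parse_settings` are invented for illustration. The caller gets whatever could be parsed, plus the list of problems, and decides what to do with both.

```rust
/// Every field is optional because any of them may have failed to parse.
#[derive(Debug, Default, PartialEq)]
pub struct PartialSettings {
    pub host: Option<String>,
    pub port: Option<u16>,
}

/// Returns the partial result alongside the errors instead of discarding it.
pub fn parse_settings(host: &str, port: &str) -> (PartialSettings, Vec<String>) {
    let mut out = PartialSettings::default();
    let mut errors = Vec::new();

    if host.is_empty() {
        errors.push("host: must not be empty".to_string());
    } else {
        out.host = Some(host.to_string());
    }

    match port.parse::<u16>() {
        Ok(p) => out.port = Some(p),
        Err(e) => errors.push(format!("port: {}", e)),
    }

    (out, errors)
}
```

The caller could then fill in a default for `port` and log the error rather than rejecting the whole payload.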

Qwuke a day ago

Can you explain how this would work if there weren't optional fields in your struct? Because if you're just suggesting that you use optional fields, that already works with vanilla serde.
Also, defaults for fields are already a vanilla serde feature.
the_gipsy a day ago

No, you don't understand, but YOU are close ;) don't give up!
The whole point is to Make Illegal States Unrepresentable™. You see, if you WANT to allow a partial result, then you just encode it into the type with Option<T>, Result<T,E> or your own types. But if you DON'T want them, then you can sleep tight, assured that there won't be some nulls/nils/undefineds/zero-values creeping into the deepest core of your logic. Illegal values will be stopped at the border and denied entry. Values that are allowed to be "missing" (or similar, you can express a lot more here) can come in, but we know that they can be missing and must be handled on access (instead of NPEs, crashes, or worse: "default zero values").
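
In code, a minimal std-only sketch (`Profile`, `build_profile` and `greeting` are hypothetical names, and the `Option` arguments stand in for "was this field present in the input"):

```rust
#[derive(Debug, PartialEq)]
pub struct Profile {
    /// Required: a `Profile` cannot exist without it.
    pub id: u64,
    /// Allowed to be missing, and the type says so.
    pub nickname: Option<String>,
}

/// The border check: a missing required field is rejected here, once.
pub fn build_profile(id: Option<u64>, nickname: Option<String>) -> Result<Profile, String> {
    let id = id.ok_or_else(|| "id is required".to_string())?;
    Ok(Profile { id, nickname })
}

/// Core logic: `id` needs no check, and the compiler forces the `None` case
/// for `nickname` to be handled on access.
pub fn greeting(p: &Profile) -> String {
    match &p.nickname {
        Some(n) => format!("hello, {}", n),
        None => format!("hello, user #{}", p.id),
    }
}
```

There is no way to construct a `Profile` with a zero or null `id` by accident, and no way to read `nickname` without deciding what absence means.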
- btown a day ago
  
  In the spirit of http://www.catb.org/jargon/html/Z/Zawinskis-Law.html ...
  "Every program attempts to expand until its clients and counterparties are so varied in protocol version (and implementation competency) that it must deserialize every field as optional, and handle said absence in business logic. Those programs which cannot so expand are replaced by ones which can."
  (On a related note... I would give the world for Python to have optional chaining/safe navigation operators. The fact that Typescript is the Language that Makes Untrusted APIs Tolerable, while Python is the Language of Low-Level AI Iteration, and both have horrible ergonomics when trying to implement the other, is a source of endless frustration to me.)
OptionOfT a day ago

But you can set defaults for fields when they are missing with serde.