-
Notifications
You must be signed in to change notification settings - Fork 7
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Always trim string fields #101
base: main
Are you sure you want to change the base?
Conversation
"""Processor for string values""" | ||
if isinstance(value, str): | ||
return value.strip() | ||
return value |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Perhaps we should return None
if value is not a string instance...? Need an opinion here.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I feel like we should convert it to str
unless it is None
.
I also wonder if we should return None
if not value.strip()
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Both are easy to implement. I don't particullary mind but I don't feel I am the one to make such decision.
def loop(values): | ||
if not isinstance(values, Iterable): | ||
return values | ||
return [processor(value) for value in values] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I wonder if we should skip None
values in the resulting list.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I was actually thinking of following this PR with another one that would apply some sort of "clear empty items" on all fields returning lists of items.
But later in this comment I was told that by design we should keep processors transparent, so right now I'm not sure how much of post-processing is actually allowed. This particular PR also goes against this philosophy, though I'd argue is relatively non-invasive and seems useful.
I guess decision needs to be made at higher level whenever we want to stick to this design goal or are we deliberately going to break it next release, which would allow this and other similar PR to be included in the library.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
+0.5 to these processors.
Even if we are processing str
, I feel it is justified to meet the schema requirements, which I feel are for all these fields to be trimmed, even if the schema only says that explicitly for article body.
I think that's fine to strip string fields, but we need to change the documentation. |
Always trim string fields.
For all fields of type
Optional[str]
orstr
applystr.strip()
through processors.For all fields of type
Optional[List[str]]
orList[str]
applystr.strip()
to all elements through processors.Extract processors to separate classes following approach found in extractors to reduce duplication in code.