-
Notifications
You must be signed in to change notification settings - Fork 1.8k
escape special chars in csv and json output. #802
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Changes from 2 commits
1f56de0
2cb120a
5db5aab
807edf4
f3ceed0
File filter
Filter by extension
Conversations
Jump to
Diff view
Diff view
There are no files selected for viewing
| Original file line number | Diff line number | Diff line change | ||||||
|---|---|---|---|---|---|---|---|---|
|
|
@@ -160,13 +160,19 @@ std::string StrFormat(const char* format, ...) { | |||||||
| return tmp; | ||||||||
| } | ||||||||
|
|
||||||||
| void ReplaceAll(std::string* str, const std::string& from, | ||||||||
| const std::string& to) { | ||||||||
| std::size_t start = 0; | ||||||||
| while ((start = str->find(from, start)) != std::string::npos) { | ||||||||
| str->replace(start, from.length(), to); | ||||||||
| start += to.length(); | ||||||||
| } | ||||||||
| std::string StrEscape(const std::string & s) { | ||||||||
| std::string tmp; | ||||||||
|
Collaborator
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. tmp.reserve(s.size());
Contributor
Author
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. feel free to take over the PR
Collaborator
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. I mean, it's your PR.
Contributor
Author
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. my use case isn't going to notice the 2 ns (maybe?) speed improvement, if you need it feel free to add it though.
Collaborator
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. It's a review comment, it is good to address them, so that there is forward progress on proposed changes.
Suggested change
Contributor
Author
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. you do realize that std::string is already designed for this kind of concatenation? this is trading 0-1-2 reallocs (depending on the length of the key) with a guaranteed two calls (strlen,realloc), not sure if there's a real win here, unless it's an issue of coding style?
Collaborator
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. We know the escaped string will be at least not shorter than the input string.
It is well-known that |
||||||||
| std::for_each(s.begin(), s.end(), [&tmp](char c) { | ||||||||
| switch (c) { | ||||||||
| case '\b': tmp += "\\b"; break; | ||||||||
| case '\f': tmp += "\\f"; break; | ||||||||
| case '\n': tmp += "\\n"; break; | ||||||||
| case '\r': tmp += "\\r"; break; | ||||||||
| case '\t': tmp += "\\t"; break; | ||||||||
| case '\\': tmp += "\\\\"; break; | ||||||||
| case '"' : tmp += "\\\""; break; | ||||||||
| default : tmp += c; break; } } ); | ||||||||
| return tmp; | ||||||||
| } | ||||||||
|
|
||||||||
| #ifdef BENCHMARK_STL_ANDROID_GNUSTL | ||||||||
|
|
||||||||
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can you please link where it's documented what should be escaped in CSV?
I'm probably looking in wrong places, https://tools.ietf.org/html/rfc4180 only says to replace
"with""?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
is there even a standard for csv? at least \r and \n need to be escaped, \b and \f probably should be, \t is a matter of taste- the tests dont check for it in any case. and anyway, isn't csv support being deprecated #500 ?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm not seeing that in a quick test with loading
into libreoffice:

And replacing
\nwith "\n":breaks it:

Thus yes, i'm curious as to motivation/documentation.
Yep
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
you're right, \t and \ should proabaly not be escaped for csv. the other ones I think the "right thing" is to escape them. \b because if you're lookign at csv, you typically want to see it on a terminal, and any \b would end up hidden, and they're more often than not in there by mistake.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I agree that JSON changes are good.
The issue does not mention anything about CSV; are you affected by CSV side of things?
How about a middle ground solution then?
reserve()suggestion.