Clean way to unescape a single unicode char in python json.dumps(array)?
I need to send some data to a server that I do not own or operate. The data must be sent in a form parameter via HTTP POST. The key must be "logs" and the value must be a JSON array. Each element of this array is a CSV string, delimited by the unicode representation of Ctrl-A, \u0001 (not the literal char).
When I convert my array of CSV strings to a JSON array, via json.dumps, it implicitly escapes some characters (such as ").
Problem: json.dumps also implicitly escapes my CSV delimiter \u0001, changing it to \\u0001, which causes the server to reject my data. To work around this, I manually "un"-escape it: s.replace('\\\\u0001', '\\u0001')
Question: Are there any potential repercussions to using this hacky workaround? Is there a more elegant way for me to handle this problem?
import json
import requests

# Ctrl-A (\u0001) delimited CSV strings
logs = ['VAL1\\u0001"key":"VAL2"', 'VAL1\\u0001"key":"VAL2"']
# Serialize as JSON (it implicitly escapes chars, including Ctrl-A)
serialized_logs = json.dumps(logs)
# replace '\\u0001' with '\u0001' (unescape it)
# this seems HACKY -- is there a better way to handle this?
serialized_logs = serialized_logs.replace('\\\\u0001', '\\u0001')
# send over HTTP (url is the service endpoint, defined elsewhere)
params = {'logs': serialized_logs}
response = requests.post(url, data=params)
Note: Python 2.7
Tags: python, json, string, unicode, escaping
I don't understand why you don't just send the normal byte... why does it have to be escaped?
– Joran Beasley
Nov 16 '18 at 0:10
@JoranBeasley Hi again Joran, thanks for coming back. I am wondering that very same thing. This is a requirement imposed by the service, and I have no control over this. I agree, it is strange.
– James Wierzba
Nov 16 '18 at 0:14
repl.it/@JoranBeasley/SoreGrimQuotient looks like it is still escaped here ... with only one ... did you delete your other question?
– Joran Beasley
Nov 16 '18 at 0:16
I suspect you are misinterpreting the service's documentation ...
– Joran Beasley
Nov 16 '18 at 0:18
@JoranBeasley The backslash is escaped though. The chars that I want to be sent over HTTP are \u0001, not \\u0001
– James Wierzba
Nov 16 '18 at 0:18
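One way to see the difference being discussed in these comments is to look at the form-encoded body for each variant. The following is a minimal sketch, assuming the standard library's urlencode produces an encoding comparable to what requests sends for data=params; the shortened sample value VAL1...VAL2 and the variable names are made up for illustration.

```python
import json

try:
    from urllib.parse import urlencode  # Python 3
except ImportError:
    from urllib import urlencode        # Python 2

# Literal backslash + "u0001" in the data: json.dumps escapes the backslash,
# so the JSON text contains \\u0001.
escaped_backslash = json.dumps([u'VAL1\\u0001VAL2'])

# Real Ctrl-A character in the data: json.dumps emits the JSON escape \u0001.
single_backslash = json.dumps([u'VAL1\u0001VAL2'])

# Form-encode both, as an HTTP POST form parameter named "logs".
body_escaped = urlencode({'logs': escaped_backslash})
body_single = urlencode({'logs': single_backslash})

print(body_escaped)  # logs=%5B%22VAL1%5C%5Cu0001VAL2%22%5D
print(body_single)   # logs=%5B%22VAL1%5Cu0001VAL2%22%5D
```

The first body carries two encoded backslashes (%5C%5C) before u0001, the second carries one, which is the form the service is described as accepting.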
asked Nov 16 '18 at 0:02 by James Wierzba (edited Nov 16 '18 at 0:46)
1 Answer
Just use unicode strings in your list and don't escape the unicode:
logs2 = [u'VAL1\u0001"key":"VAL2"', u'VAL1\u0001"key":"VAL2"']
serialized_logs2 = json.dumps(logs2)
This should do the right thing. You can verify it with:
print(serialized_logs2 == serialized_logs.replace("\\\\u0001", "\\u0001"))
(where serialized_logs is your json.dumps result above)
See: https://repl.it/@JoranBeasley/SoreGrimQuotient (Python 2)
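The equivalence claimed above can be checked end to end. This is a minimal sketch rather than the answerer's exact code: it uses unicode literals for both lists so that it runs on Python 2.7 and Python 3 alike, and the names logs_real, logs_literal and the result variables are made up for illustration.

```python
import json

# Real Ctrl-A (U+0001) as the delimiter, written as an escape in a unicode literal.
logs_real = [u'VAL1\u0001"key":"VAL2"', u'VAL1\u0001"key":"VAL2"']

# Literal backslash followed by "u0001" (six characters), as in the question.
logs_literal = [u'VAL1\\u0001"key":"VAL2"', u'VAL1\\u0001"key":"VAL2"']

# Route 1: let json.dumps escape the real control character.
serialized_real = json.dumps(logs_real)

# Route 2: the question's workaround, collapsing \\u0001 to \u0001 afterwards.
serialized_workaround = json.dumps(logs_literal).replace('\\\\u0001', '\\u0001')

# Both routes yield the same JSON text.
same_output = (serialized_real == serialized_workaround)

# Decoding that JSON text gives back the one-character Ctrl-A delimiter.
round_trip = json.loads(serialized_real)

print(same_output)
print(repr(round_trip[0]))
```

One design note: the string replace in route 2 rewrites every occurrence of the escaped sequence in the serialized text, including any that might legitimately appear inside a field value, whereas route 1 never needs post-processing.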
What is the difference between u"\u0001" and "\\u0001"?
– James Wierzba
Nov 16 '18 at 0:44
Oh wait, are you in Python 3? In Python 3 you don't need the u prefix. u"s" just makes sure it's a unicode string; "s" is just a bytestring (in Python 3 that changes).
– Joran Beasley
Nov 16 '18 at 0:45
I'm in python 2, sorry I should have specified
– James Wierzba
Nov 16 '18 at 0:45
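The difference asked about in this thread can be made concrete with a short sketch. The variable names are made up for illustration, and unicode literals are used so the snippet behaves the same on Python 2.7 and Python 3. Note that in Python 2 a plain "..." byte-string literal does not interpret \u at all, so "\u0001" without the u prefix is also six characters there.

```python
import json

control_char = u"\u0001"   # one character: U+0001 (Ctrl-A)
escaped_text = u"\\u0001"  # six characters: a backslash, then "u0001"

lengths = (len(control_char), len(escaped_text))
code_point = ord(control_char)

# json.dumps escapes the control character as \u0001,
# but escapes the literal backslash as \\ and leaves "u0001" alone.
dumped_control = json.dumps(control_char)
dumped_escaped = json.dumps(escaped_text)

print(lengths)         # (1, 6)
print(code_point)      # 1
print(dumped_control)  # "\u0001"
print(dumped_escaped)  # "\\u0001"
```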
answered Nov 16 '18 at 0:32 by Joran Beasley (edited Nov 16 '18 at 0:47)