Clean way to unescape a single unicode char in python json.dumps(array)?

























I need to send some data to a server that I do not own or operate. The data must be sent in a form parameter via HTTP POST. The key must be "logs" and the value must be a JSON array. Each element of this array is a CSV string, delimited by the Unicode escape sequence for Ctrl-A, \u0001 (not the literal character).

When I convert my array of CSV strings to a JSON array via json.dumps, it implicitly escapes some characters (such as ").

Problem: json.dumps also implicitly escapes the backslash in my CSV delimiter \u0001, changing it to \\u0001, which causes the server to reject my data. To work around this, I manually "un"-escape it: s.replace('\\\\u0001', '\\u0001')

Question: Are there any potential repercussions to using this hacky workaround? Is there a more elegant way for me to handle this problem?

import json
import requests

# CSV strings delimited by the escape text for Ctrl-A (\u0001)
logs = ['VAL1\\u0001"key":"VAL2"', 'VAL1\\u0001"key":"VAL2"']

# Serialize as JSON (it implicitly escapes chars, including the delimiter's backslash)
serialized_logs = json.dumps(logs)

# replace '\\u0001' with '\u0001' (unescape it)
# this seems HACKY -- is there a better way to handle this?
serialized_logs = serialized_logs.replace('\\\\u0001', '\\u0001')

# send over HTTP (url is the service endpoint, defined elsewhere)
params = {'logs': serialized_logs}
response = requests.post(url, data=params)

Note: Python 2.7





























  • I don't understand why you don't just send the normal byte... why does it have to be escaped?

    – Joran Beasley
    Nov 16 '18 at 0:10











  • @JoranBeasley Hi again Joran, thanks for coming back. I am wondering that very same thing. This is a requirement imposed by the service, and I have no control over this. I agree, it is strange.

    – James Wierzba
    Nov 16 '18 at 0:14











  • repl.it/@JoranBeasley/SoreGrimQuotient looks like it is still escaped here ... with only one ... did you delete your other question?

    – Joran Beasley
    Nov 16 '18 at 0:16







  • I suspect you are misinterpreting the service's documentation ...

    – Joran Beasley
    Nov 16 '18 at 0:18











  • @JoranBeasley The backslash is escaped though. The chars that I want to be sent over HTTP are \u0001, not \\u0001

    – James Wierzba
    Nov 16 '18 at 0:18
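
To make the discussion above concrete, here is a minimal sketch of what ends up on the wire when a JSON payload is sent as a form parameter. It assumes standard form encoding (application/x-www-form-urlencoded), which is what requests.post(url, data=...) uses for a dict; the variable names are illustrative and no request is actually sent.

```python
import json

try:
    from urllib.parse import urlencode, parse_qs  # Python 3
except ImportError:
    from urllib import urlencode  # Python 2
    from urlparse import parse_qs

# One CSV string containing a real Ctrl-A control character (assumed sample data).
logs = [u'VAL1\u0001"key":"VAL2"']

# json.dumps writes the control character as the six-character escape \u0001.
payload = json.dumps(logs)

# Form encoding percent-encodes the backslash as %5C; the server decodes it back.
body = urlencode({'logs': payload})
decoded = parse_qs(body)['logs'][0]

print(body)
print(decoded == payload)
```

The form-encoding layer only percent-encodes characters and is reversed by the server, so whatever json.dumps produced is what the service eventually parses.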






















python json string unicode escaping














asked Nov 16 '18 at 0:02 by James Wierzba (edited Nov 16 '18 at 0:46)



















1 Answer














Just use unicode strings in your list and don't escape the backslash, so that each string contains the actual Ctrl-A character:

logs2 = [u'VAL1\u0001"key":"VAL2"', u'VAL1\u0001"key":"VAL2"']
serialized_logs2 = json.dumps(logs2)

This should do the right thing. You can verify it with:

print(serialized_logs2 == serialized_logs.replace("\\\\u0001", "\\u0001"))

(where serialized_logs is your json.dumps result above)

See: https://repl.it/@JoranBeasley/SoreGrimQuotient (python2)
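
As a self-contained check of the claim above, the following sketch compares the two approaches side by side. It assumes the question's strings contain the six-character escape text and the answer's strings contain the real control character; the variable names are made up for illustration.

```python
import json

# Question's approach: the delimiter is the literal text backslash-u-0-0-0-1.
escaped_text_logs = [u'VAL1\\u0001"key":"VAL2"']

# Answer's approach: the delimiter is the real Ctrl-A control character.
control_char_logs = [u'VAL1\u0001"key":"VAL2"']

# json.dumps doubles the backslash, so the workaround has to undo that.
workaround = json.dumps(escaped_text_logs).replace('\\\\u0001', '\\u0001')

# json.dumps escapes the control character as \u0001 with no extra step.
direct = json.dumps(control_char_logs)

print(workaround == direct)                     # True
print(json.loads(direct) == control_char_logs)  # True
```

Both routes produce the same serialized text, but only the second keeps the data and its JSON form consistent, so json.loads returns exactly what was passed in.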































  • What is the difference between u"\u0001" and "\\u0001"?

    – James Wierzba
    Nov 16 '18 at 0:44












  • Oh wait, are you in Python 3? In Python 3 you don't need the u prefix. u"s" just makes sure it's a unicode string; "s" is just a bytestring (in Python 3 that changes).

    – Joran Beasley
    Nov 16 '18 at 0:45






  • I'm in Python 2, sorry, I should have specified

    – James Wierzba
    Nov 16 '18 at 0:45
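
The difference asked about in these comments can be seen directly. This is a small sketch, written with u"" literals so it behaves the same on Python 2.7 and Python 3; the variable names are illustrative.

```python
import json

control_char = u"\u0001"  # one code point: the Ctrl-A control character
escape_text = u"\\u0001"  # six characters: backslash, u, 0, 0, 0, 1

print(len(control_char))         # 1
print(len(escape_text))          # 6

# JSON escapes the control character once, and the backslash in the text once.
print(json.dumps(control_char))  # "\u0001"
print(json.dumps(escape_text))   # "\\u0001"
```

In other words, the first literal already is the delimiter character, while the second is only text that looks like its escape, which is why serializing it yields a doubled backslash.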


















answered Nov 16 '18 at 0:32 by Joran Beasley (edited Nov 16 '18 at 0:47)











