Python get request and retrieving data from search
I'm trying to use the requests module to retrieve data from this website:
https://toelatingen.ctgb.nl/
For example, I want to receive the data that is found when I enter "11462" in the "Zoekterm" field.
import requests

data = {"searchTerm": "11462"}
session = requests.Session()
r = session.post('https://toelatingen.ctgb.nl/', data=data)
body_data = r.text
Unfortunately, body_data does not contain the search results.
Thanks for helping me.
python search python-requests
The POST data the site needs is longer than what you sent. If you don't insist on using requests, I suggest trying Selenium to make the job easier. (Use Fiddler to see how long the real request is.)
– kcorlidy
Nov 14 '18 at 9:17
asked Nov 13 '18 at 20:35
Brian Barbieri
1 Answer
The reason you're not getting the response data is that the site doesn't perform the search at that URL. Instead, it makes a call to https://toelatingen.ctgb.nl/nl/admissions/overview.
When you're trying to get information off the internet, the first thing you want to do is check how your web browser is getting the data. If you open the inspection tool that comes with your browser (typically the hotkey is Ctrl+Shift+I), you should be able to find a Network tab that tracks the requests and responses the browser makes. Once you have that open, get your browser to display the information you want and watch the Network tab while it does so. Check the responses that come up to find the one that has the information you want, and then replicate the request your browser used.
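A minimal sketch of what replicating such a request could look like with requests. Only the URL comes from this answer; the HTTP method, the header values and the searchTerm form field (borrowed from the question) are assumptions that should be replaced with whatever the Network tab actually shows. build_search_request and fetch_results are hypothetical helper names.

```python
# URL taken from the answer. Everything else below is a placeholder to be
# replaced with the values copied from the browser's Network tab.
SEARCH_URL = "https://toelatingen.ctgb.nl/nl/admissions/overview"


def build_search_request(search_term):
    """Return keyword arguments suitable for requests.Session.request()."""
    return {
        "method": "POST",  # assumption: check the real method in the Network tab
        "url": SEARCH_URL,
        "headers": {
            # assumption: example headers, not the site's confirmed requirements
            "Accept": "application/json",
            "X-Requested-With": "XMLHttpRequest",
        },
        # assumption: field name reused from the question; the real payload
        # may use different keys and contain more of them
        "data": {"searchTerm": search_term},
    }


def fetch_results(session, search_term):
    """Send the replicated request and return the decoded JSON body."""
    response = session.request(**build_search_request(search_term))
    response.raise_for_status()  # fail loudly on 4xx/5xx responses
    return response.json()
```

With the real library this would be called as fetch_results(requests.Session(), "11462"). Passing the session in keeps cookies from earlier requests available, which matters if the site sets a token on the first page load.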
In your case:
- The browser first loads an empty page from https://toelatingen.ctgb.nl/
- It then loads a bunch of static files (mostly woff and js; these are used for styling the webpage and handling different procedures).
- Then it makes a call to https://toelatingen.ctgb.nl/nl/admissions/overview. We can be pretty sure this is the call we want because the response is JSON containing the information we see displayed on the screen.
- We then copy out all the information (headers and form data, line for line) from that request, plug it in, and see if the requests module returns the same JSON.
- If it doesn't, that most likely means we're missing something (most often a CSRF token or a special Accept-Encoding) and we need to do some more tinkering.
- I would also recommend taking a little time to prune the request data/headers: most of the time they contain extra terms that the server doesn't actually need. This will save space and give you a better idea of which parts of the request you can change.
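The pruning step can be automated to some extent. The following is only a sketch: prune_headers is a hypothetical helper, and it assumes that an unchanged response body means a header was not needed, which will not hold if the server returns something different on every call (timestamps, for example).

```python
def prune_headers(send, headers):
    """Return the subset of `headers` the server appears to require.

    `send` is any callable that takes a headers dict and returns the
    response body. The body obtained with the full header set is the
    baseline every trial is compared against.
    """
    baseline = send(headers)
    kept = dict(headers)  # work on a copy so the caller's dict is untouched
    for name in list(headers):
        trial = {k: v for k, v in kept.items() if k != name}
        try:
            same = send(trial) == baseline
        except Exception:
            # Any failure (e.g. an HTTP error raised by the sender) is
            # treated as "this header is required".
            same = False
        if same:
            kept = trial  # the header made no difference, so drop it
    return kept
```

With requests, send could be a small function that posts the copied form data with the given headers and returns response.text. The same loop works for form fields if the sender varies the payload instead of the headers.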
answered Nov 14 '18 at 20:17
Reid Ballard