image retrieval system from text query









I'm going to implement an image search engine. The goal is to let a user search a repository of images by text query. I also have to handle the crawling part of this project.



I am studying Information Retrieval and I have a basic understanding of Lucene, Solr, and Nutch, which are the tools I have to use.



My questions, which are a bit "abstract" at this stage of the project, are:



  • How do I crawl for images?

I do not have constraints on which images to have in my dataset; I just need around 1000 files. The first option is simply to use random images, but maybe there is something better I can do (e.g. building an image description while crawling).
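
To illustrate what I mean by building a description while crawling, here is a minimal sketch in Python using only the standard library. It is not Nutch code, and the names `ImageCollector` and `collect_images` are hypothetical. It assumes the page HTML has already been fetched and that the `alt` (or `title`) attribute of each `<img>` tag is a usable description, which will not always be true.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin


class ImageCollector(HTMLParser):
    """Collects (absolute image URL, description) pairs from one HTML page."""

    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url
        self.images = []

    def handle_starttag(self, tag, attrs):
        if tag != "img":
            return
        attributes = dict(attrs)
        src = attributes.get("src")
        if not src:
            # An <img> without a source cannot be downloaded, so skip it.
            return
        # Use the alt text, falling back to the title attribute, as the description.
        description = attributes.get("alt") or attributes.get("title") or ""
        self.images.append((urljoin(self.base_url, src), description.strip()))


def collect_images(html, base_url):
    """Parse one page and return the list of (image URL, description) pairs."""
    parser = ImageCollector(base_url)
    parser.feed(html)
    parser.close()
    return parser.images
```

Images that come back with an empty description could be dropped or described by some other means, such as the surrounding page text.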



  • How do I index those images?

Again, I think I will need at least a description of each image, or maybe a list of descriptors. Is there any service that builds one dynamically from the image?
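
As a toy illustration of the kind of index I have in mind, the sketch below builds an inverted index from image descriptions and answers a text query by returning the images whose description contains every query term. This is plain Python, not Lucene or Solr, which would handle analysis and ranking properly. The function names and the sample data in the usage note are hypothetical.

```python
import re
from collections import defaultdict


def tokenize(text):
    """Lowercase the text and split it into alphanumeric terms."""
    return re.findall(r"[a-z0-9]+", text.lower())


def build_index(descriptions):
    """Build an inverted index from a dict of image id -> text description."""
    index = defaultdict(set)
    for image_id, text in descriptions.items():
        for term in tokenize(text):
            index[term].add(image_id)
    return index


def search(index, query):
    """Return the ids of images whose description contains every query term."""
    terms = tokenize(query)
    if not terms:
        return set()
    result = set(index.get(terms[0], set()))
    for term in terms[1:]:
        result &= index.get(term, set())
    return result
```

For example, with descriptions `{"img1.jpg": "A tabby cat on a sofa", "img2.jpg": "Dog on a sofa"}`, calling `search(build_index(descriptions), "cat sofa")` would match only `img1.jpg`.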



Once this is built, the rest of the work should be straightforward, since I will have a dataset and an index, but if you have any suggestions, feel free to give advice.










      web-crawler search-engine information-retrieval google-image-search
















      asked Nov 10 at 14:44









      Simone Masiero



























