(neo4j) Best practice for the number of properties in relationships and nodes

























I've just started using Neo4j and, having done a few experiments, am ready to start organizing the database itself. I began by sketching a basic diagram (on paper) and ran into the following question:



Most examples in the material I'm using (Cypher and Neo4j tutorials) show only a few properties per relationship or node, so I wonder what the cost of a long list of properties is.




Q: Is it more efficient to favor a wide variety of relationship types (GOODFRIENDS_WITH, FRIENDS_WITH, ACQUAINTANCE, RIVAL, ENEMIES, etc.) or fewer types distinguished by a property (SEES_AS with type: good friend, friend, acquaintance, rival, enemy, etc.)?




The same question applies to nodes. The first draft of my diagram has a staggering number of properties (title, first name, second name, first surname, second surname, suffix, nickname, and then there are physical characteristics, personality, age, jobs...), and I suspect this may lower the performance of the database. Of course some nodes won't need all of the properties, but even the basic properties will be quite a few.




Q: What is the actual limit, and what is the advisable limit, for the number of properties on nodes and on relationships?




FYI, I am going to rework my draft to reduce the number of properties by using nodes instead (create a node :family names, another for :job, and so on). I've only just started thinking it over, though, because I'll need to analyse carefully which would-be properties should remain, especially since the change will increase the number of relationship types I'll be dealing with.



Background information:



1) I'm using Neo4j to map out all relationships between the people living in a fictional small town. The queries I'll perform will mostly be as follows:



a. find all possible paths between 2 (or more) characters



b. find all locations which 2 (or more) characters frequent



c. find all characters which have certain types of relationship (friends, cousins, neighbors, etc) to character X



d. find all characters with the same age (or similar age) who studied in the same school



e. find all characters with the same age / first name / surname / hair color / height / hobby / job / temper (easy to anger) / ...



and variations of the above.



2) I'm not a programmer, but having taught myself HTML and advanced Excel, I feel confident I'll pick up Cypher quickly enough, since it seems intuitive.










      neo4j






asked Nov 14 '18 at 15:47 by Sara Costa (edited Nov 15 '18 at 20:55 by Tezra)






















          1 Answer














          First off, for small-data "sandbox" use, this is a moot point. Even with the most inefficient data layout, as long as you avoid Cartesian products and the like, the only thing you will notice is how intuitive your data is to you. So if this is a "toy"-scale project, just focus on what makes the most organizational sense to you. If you change your mind later, restructuring the data via Cypher won't be too hard.
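To give an idea of what such a restructuring could look like, here is a rough sketch that turns the generic SEES_AS relationships from the question into dedicated relationship types. It is written in Python only to generate the Cypher text: as far as I know, classic Cypher does not let you pass a relationship type as a parameter, so you need one statement per type. The mapping `TYPE_BY_VALUE` and the function name `migration_statements` are made up for illustration, and the property values are assumed to be spelled exactly as in the question.

```python
import re

# Hypothetical mapping from the SEES_AS `type` property values used in the
# question to dedicated relationship types; adjust to your own naming.
TYPE_BY_VALUE = {
    "good friend": "GOODFRIENDS_WITH",
    "friend": "FRIENDS_WITH",
    "acquaintance": "ACQUAINTANCE",
    "rival": "RIVAL",
    "enemy": "ENEMIES",
}

# Relationship types are spliced into the query text, so only plain
# upper-case names are accepted.
_SAFE_NAME = re.compile(r"[A-Z_][A-Z0-9_]*")


def migration_statements(type_by_value):
    """Return a list of (cypher_text, parameters), one entry per value.

    Each statement replaces the SEES_AS relationships carrying one `type`
    value with a relationship of the dedicated type, copies the other
    properties across, and deletes the old relationship.
    """
    statements = []
    for value, rel_type in type_by_value.items():
        if not _SAFE_NAME.fullmatch(rel_type):
            raise ValueError(f"unsafe relationship type: {rel_type!r}")
        query = (
            "MATCH (a)-[old:SEES_AS {type: $value}]->(b)\n"
            f"CREATE (a)-[new:{rel_type}]->(b)\n"
            "SET new = properties(old)\n"
            "REMOVE new.type\n"
            "DELETE old"
        )
        statements.append((query, {"value": value}))
    return statements
```

Each pair can be pasted into the Neo4j browser (with the parameter filled in) or run through a driver session. Try it on a copy of the data first, since the old relationships are deleted.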




          Now, assuming this is a business project that needs to scale to some degree, remember that non-indexed properties are basically invisible to the Cypher planner. The more meaningful and diverse your relationship types, the better the Cypher planner will be at finding your data quickly. Favor relationships for connections you want to be able to explore, and favor properties for data you just want to see. Index any properties, or use labels, that will be key to finding a particular node (or set of nodes) in your queries.
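A toy model may help show why distinct relationship types are easier to explore than one generic type with a property. The sketch below is plain Python and is not how Neo4j stores data; the `ToyGraph` class and its method names are invented for this illustration. It simply groups each node's relationships by type and counts how many relationships have to be inspected to answer "who are X's friends?".

```python
from collections import defaultdict


class ToyGraph:
    """In-memory stand-in for a graph; relationships are grouped by type."""

    def __init__(self):
        # node -> relationship type -> list of (neighbour, properties)
        self._out = defaultdict(lambda: defaultdict(list))

    def relate(self, start, rel_type, end, **props):
        self._out[start][rel_type].append((end, props))

    def by_type(self, start, rel_type):
        """Follow one relationship type; other types are never looked at."""
        rels = self._out[start].get(rel_type, [])
        return [end for end, _ in rels], len(rels)

    def by_property(self, start, rel_type, key, value):
        """Follow one type, then filter every relationship on a property."""
        rels = self._out[start].get(rel_type, [])
        hits = [end for end, props in rels if props.get(key) == value]
        return hits, len(rels)


# Layout 1: one relationship type per kind of connection.
typed = ToyGraph()
typed.relate("X", "FRIENDS_WITH", "Ann")
typed.relate("X", "RIVAL", "Bob")
typed.relate("X", "ENEMIES", "Cy")

# Layout 2: a single generic type with a distinguishing property.
generic = ToyGraph()
generic.relate("X", "SEES_AS", "Ann", type="friend")
generic.relate("X", "SEES_AS", "Bob", type="rival")
generic.relate("X", "SEES_AS", "Cy", type="enemy")

friends_typed, inspected_typed = typed.by_type("X", "FRIENDS_WITH")
friends_generic, inspected_generic = generic.by_property(
    "X", "SEES_AS", "type", "friend"
)
```

Both layouts find the same friend, but the generic layout has to look at all three relationships while the typed one looks at a single relationship. In Cypher terms this is roughly the difference between matching `(x)-[:FRIENDS_WITH]-(other)` and matching `(x)-[r:SEES_AS]-(other) WHERE r.type = 'friend'`. At 100 to 200 characters you will not feel it either way.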






          answered Nov 15 '18 at 21:05 by Tezra (edited Nov 21 '18 at 14:01)












          • I'll have between 100 and 200 characters, and the protagonists alone will have over 100 connections to other people. It's a personal project, but it isn't exactly small. Besides, small projects can be the precursors of bigger ones, and I prefer to get used to best practices from the start, so I'll go with the second part of your answer.

            – Sara Costa
            Nov 15 '18 at 21:45






          • @SaraCosta Any project with fewer than 1 million nodes/rels, where you can change the schema on a whim, is a "toy"-scale project. Best practice is slightly different for toy projects: since performance will never be a real issue, the only thing you need to worry about is forgetting how to interact with your own data, so ease of maintenance and adaptation is your only real concern. If you worry about scale too much on a toy project, you can quickly forget why you made certain decisions, and your own data can become foreign to you. So I'd say try to favor what makes intuitive sense to you.

            – Tezra
            Nov 15 '18 at 21:53











          • Thanks for putting sizes in perspective. I see what you mean now.

            – Sara Costa
            Nov 15 '18 at 22:11

















