(neo4j) Best practice for the number of properties in relationships and nodes
I've just started using Neo4j and, having done a few experiments, am ready to start organizing the database itself. I began by sketching a basic diagram on paper and ran into the following question:
Most examples in the material I'm using (Cypher and Neo4j tutorials) show only a few properties per relationship or node, so I wonder what the cost of a long list of properties is.
Q: Is it more efficient to favor a wide variety of relationship types (GOODFRIENDS_WITH, FRIENDS_WITH, ACQUAINTANCE, RIVAL, ENEMIES, etc.) or fewer types distinguished by a property (SEES_AS with type: good friend, friend, acquaintance, rival, enemy, etc.)?
The same question applies to nodes. The first draft of my diagram has a staggering number of properties (title, first name, second name, first surname, second surname, suffix, nickname, plus physical characteristics, personality, age, jobs...), and I suspect this may hurt the performance of the database. Not every node will need all of them, but even the basic set is fairly long.
Q: What is the actual limit, and the advisable limit, on the number of properties for both nodes and relationships?
FYI, I am going to rework my draft to reduce the number of properties by using nodes instead (create a node :family names, another for :job and so on). I've only just started thinking it over, though, because I need to analyse carefully which would-be properties should remain properties, not least because the change will increase the number of relationship types I'll be dealing with.
Background information:
1) I'm using Neo4j to map out all relationships between the people living in a fictional small town. The queries I'll run will mostly be as follows:
a. find all possible paths between 2 (or more) characters
b. find all locations which 2 (or more) characters frequent
c. find all characters which have certain types of relationship (friends, cousins, neighbors, etc) to character X
d. find all characters with the same age (or similar age) who studied in the same school
e. find all characters with the same age / first name / surname / hair color / height / hobby / job / temper (easy to anger) / ...
and variations of the above.
2) I'm not a programmer, but having taught myself HTML and advanced Excel, I feel confident I'll pick up Cypher quickly enough, since it seems intuitive.
neo4j
edited Nov 15 '18 at 20:55 by Tezra
asked Nov 14 '18 at 15:47 by Sara Costa
1 Answer
First off, for small-data "sandbox" use, this is a moot point. Even with the most inefficient data layout, as long as you avoid Cartesian products and the like, the only thing you will notice is how intuitive your data is to you. So if this is a "toy" scale project, just focus on what makes the most organizational sense to you. If you change your mind later, reshaping the data with Cypher won't be too hard.
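To put a number on the Cartesian-product warning: a Cypher pattern such as `MATCH (a:Person), (b:Person)` with nothing connecting `a` and `b` pairs every node with every other node. The sketch below is plain Python, not Neo4j, and only counts the rows such a query would have to produce. It assumes a hypothetical town of 200 characters; the `Person` label and the generated names are illustrative, not taken from the question.

```python
from itertools import product

# Hypothetical cast of 200 characters (names are placeholders).
characters = [f"character_{i}" for i in range(200)]

# Two disconnected patterns, roughly `MATCH (a:Person), (b:Person)`:
# every character is paired with every character, including itself.
pairs = sum(1 for _ in product(characters, repeat=2))

# A third disconnected pattern multiplies the row count again.
triples = len(characters) ** 3

print(f"two disconnected patterns:   {pairs:,} rows")
print(f"three disconnected patterns: {triples:,} rows")
```

Even at this scale the row count grows multiplicatively with each unconnected pattern, which is why connecting the patterns with a relationship matters more than the exact number of properties.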
Now, assuming this is a business project that needs to scale to some degree, remember that non-indexed properties are basically invisible to the Cypher planner. The more meaningful and diverse your relationship types, the better the Cypher planner will be at finding your data quickly. Favor relationships for connections you want to be able to explore, and favor properties for data you just want to see. For anything that will be key to finding a particular node (or set of nodes) in your queries, index the property or use a label.
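As a rough illustration of the difference between many relationship types and one generic type with a property, here is a toy in-memory model in Python. It is only a sketch under stated assumptions: `ToyGraph`, its methods, and the character names are hypothetical, and the layout is not a description of how Neo4j stores data. The Cypher in the comments is an approximate equivalent, with `Person` and `name` as assumed label and property names.

```python
from collections import defaultdict


class ToyGraph:
    """Toy in-memory graph; an illustration only, not how Neo4j stores data."""

    def __init__(self):
        # Layout A: outgoing relationships grouped by relationship type.
        self.by_type = defaultdict(lambda: defaultdict(list))
        # Layout B: one generic list per node, with the kind kept as a property.
        self.generic = defaultdict(list)

    def relate(self, source, kind, target):
        """Record the same relationship in both layouts."""
        self.by_type[source][kind].append(target)
        self.generic[source].append({"target": target, "type": kind})

    def typed_neighbours(self, source, kind):
        # Comparable in spirit to:
        #   MATCH (:Person {name: $source})-[:FRIENDS_WITH]->(p) RETURN p
        # Only relationships of the requested type are touched.
        return list(self.by_type[source][kind])

    def filtered_neighbours(self, source, kind):
        # Comparable in spirit to:
        #   MATCH (:Person {name: $source})-[r:SEES_AS]->(p)
        #   WHERE r.type = $kind RETURN p
        # Every outgoing relationship is inspected, then filtered.
        return [rel["target"] for rel in self.generic[source] if rel["type"] == kind]


town = ToyGraph()
town.relate("Ana", "FRIENDS_WITH", "Bruno")
town.relate("Ana", "RIVAL", "Carla")
town.relate("Ana", "FRIENDS_WITH", "Dario")

friends_typed = town.typed_neighbours("Ana", "FRIENDS_WITH")
friends_filtered = town.filtered_neighbours("Ana", "FRIENDS_WITH")
print(friends_typed == friends_filtered)
```

Both lookups return the same characters; the difference is that the property-based version has to look at every relationship of the node before discarding the ones it does not want, which is the kind of work a distinct relationship type lets a query skip.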
I'll have between 100 and 200 characters, and the protagonists will have over 100 connections to people alone. It's a personal project, but it isn't exactly small. Besides, small projects can be the precursors of bigger ones and I always prefer to get used to best practices, so I'll go with the second part of your answer.
– Sara Costa
Nov 15 '18 at 21:45
@SaraCosta Any project with fewer than 1 million nodes/rels, where you can change the schema on a whim, is a "toy" scale project. Best practice is slightly different for toy projects: since performance will never be a real issue, the only thing you need to worry about is forgetting how to interact with your own data, so ease of maintenance and adaptation is your only real concern. If you worry about scale too much on a toy project, you can quickly forget why you made certain decisions, and your own data can become foreign to you. So I'd say try to favor what makes intuitive sense to you.
– Tezra
Nov 15 '18 at 21:53
Thanks for putting sizes in perspective. I see what you mean now.
– Sara Costa
Nov 15 '18 at 22:11
Answer by Tezra: answered Nov 15 '18 at 21:05, edited Nov 21 '18 at 14:01