How can I use ANTLR to automate source code navigation?

I'd like my app to have a basic understanding of source code in multiple languages so that it can automate code navigation.



  • E.g. I want it to understand that some text is a variable, that the variable is of a certain type and that the type is defined in a specific file.

  • I don't want to define grammars myself; I'd like to use some open source tools.

From what I understand I need a lexer / tokenizer.



After a little bit of research, I found ANTLR, which has quite a few grammars already defined.



I'd like to accomplish three goals:



  1. Provide a grammar file for language X

  2. Provide some source code for language X

  3. Get the tokenized code, so I can navigate it

My preferred technology is C#, but Python or even some hybrid approach with Docker-embedded ANTLR would also be fine.



Can anyone provide me with a quick-start example?



ANTLR even has a C# port: ANTLRCS. I cannot find any examples of how to use it though.



If there are better approaches than using ANTLR, please do not hesitate to share :)

  • You are making the mistake of assuming 1) that if you have a parser, you can get useful type information easily, 2) that you have accurate source location information and 3) that the grammars operate in a similar way so one tool can treat them similarly. See my essay on Life After Parsing: semanticdesigns.com/Products/DMS/LifeAfterParsing.html
    – Ira Baxter
    Nov 12 at 20:49

  • Stack Overflow is not well suited for "a quick-start example" (the internet is full of those). I recommend you do some research yourself and try to do what you outline. If you get stuck along the way, feel free to ask a specific question on SO. Good luck!
    – Bart Kiers
    Nov 12 at 21:03

  • @BartKiers I tried to play around with ANTLRCS, but got stuck almost immediately. I could find no docs about the workflow and no working examples or tutorials on the web. Maybe I used bad keywords, but honestly, I couldn't find anything useful.
    – Andrzej Gis
    Nov 12 at 21:40

  • 1) Even if your app is in C#, you should be able to use Java ANTLR (the generator) and only set it to generate the parser itself in C# (i.e. C# being the target language, not the language the tool itself is written in): github.com/antlr/antlr4/tree/master/runtime/CSharp
    – Jiri Tousek
    Nov 19 at 8:42
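
To make the generator step in the comment above concrete, here is a minimal sketch in Python that only assembles the command line for the Java-based ANTLR 4 tool with C# as the target. The `-Dlanguage` and `-o` options belong to the ANTLR 4 tool; the jar name, grammar file name and output folder are placeholders I made up, and the helper `antlr_command` is hypothetical.

```python
def antlr_command(jar_path, grammar_path, output_dir, target="CSharp"):
    """Build the argument list for the ANTLR 4 tool; nothing is executed here."""
    return [
        "java", "-jar", jar_path,   # the generator itself runs on Java
        f"-Dlanguage={target}",     # language of the generated lexer/parser
        "-o", output_dir,           # folder for the generated source files
        grammar_path,               # the .g4 grammar for language X
    ]


# Placeholder file names, assumed for illustration only.
command = antlr_command("antlr-4-complete.jar", "X.g4", "Generated")
print(" ".join(command))
```

Passing the list to `subprocess.run(command, check=True)` would invoke the tool, assuming Java and the ANTLR jar are available. The generated C# files then need the ANTLR C# runtime linked above in order to compile.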

  • 2) It's not enough to tokenize the input for your purposes: a variable name and a type name are both just identifiers to the lexer. So you need the full parser, not just the lexer.
    – Jiri Tousek
    Nov 19 at 8:43
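
The point in the comment above can be illustrated without ANTLR. The following is a toy sketch in Python with a made-up one-rule grammar (`declaration : IDENT IDENT SEMI`). It shows that the lexer labels both `Foo` and `bar` as plain identifiers, and that only the parsing step assigns them the roles of type and variable. All names here are hypothetical, and this is not how an ANTLR-generated parser is implemented.

```python
import re

# Hypothetical token types for a toy declaration syntax such as "Foo bar;".
TOKEN_SPEC = [
    ("IDENT", r"[A-Za-z_]\w*"),
    ("SEMI", r";"),
    ("WS", r"\s+"),
]
TOKEN_RE = re.compile("|".join(f"(?P<{name}>{pattern})" for name, pattern in TOKEN_SPEC))


def tokenize(source):
    """Return (type, text, offset) triples, skipping whitespace."""
    tokens = []
    pos = 0
    while pos < len(source):
        match = TOKEN_RE.match(source, pos)
        if match is None:
            raise SyntaxError(f"unexpected character {source[pos]!r} at offset {pos}")
        if match.lastgroup != "WS":
            tokens.append((match.lastgroup, match.group(), match.start()))
        pos = match.end()
    return tokens


def parse_declarations(tokens):
    """Apply the toy rule 'declaration : IDENT IDENT SEMI' and label each identifier by role."""
    declarations = []
    i = 0
    while i < len(tokens):
        kinds = [kind for kind, _, _ in tokens[i:i + 3]]
        if kinds != ["IDENT", "IDENT", "SEMI"]:
            raise SyntaxError(f"expected 'Type name;' at token {i}")
        (_, type_name, type_pos), (_, var_name, var_pos) = tokens[i], tokens[i + 1]
        declarations.append({
            "type": type_name, "type_offset": type_pos,
            "variable": var_name, "variable_offset": var_pos,
        })
        i += 3
    return declarations


tokens = tokenize("Foo bar;")
declarations = parse_declarations(tokens)
print(tokens)        # both names carry the same token type, IDENT
print(declarations)  # the grammar rule is what tells type from variable
```

Even then, the parser only says that `Foo` is used as a type. Finding the file where `Foo` is defined takes a further name-resolution step on top of the parse result.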

c# python antlr

edited Nov 12 at 20:58 by Elham Esmaeeli
asked Nov 12 at 20:34 by Andrzej Gis










