XRegExp.matchRecursive - add callback functionality to allow for multiple identical instances










I am using XRegExp to parse a text file, specifically to find the content between pairs of predefined comment tags. I can't change these tags, so I need to make this work with the text as provided.



I first build a list of all the tags using this regex (the link also includes sample content): https://regex101.com/r/kCwyok/1/



I then use XRegExp's matchRecursive function to get the content between each opening and closing tag, which works almost perfectly:



    // Map the list of component tags and extract data from them
    return generateComponentList(data).map((component) => {
      console.log(chalk.blue('Processing', component[1], 'component.'))
      const contents = XRegExp.matchRecursive(data, '<!-- @\\[' + component[1] + '\\][.\\w-_+]* -->', '<!-- @\\[/' + component[1] + '\\] -->', 'g')
      let body = ''
      let classes = ''

      contents.map((content) => {
        const filteredContent = filterContent(content)
        body = filteredContent.value
        classes = cleanClasses(component[2])
        console.log(chalk.green(component[1], 'processing complete.'))
      })

      // Output the content as a JSON object
      return {
        componentName: component[1],
        classes,
        body
      }
    })


The problem is that the CodeExample tag appears twice. The two tags are identical but their content is different. Because matchRecursive doesn't appear to accept a callback, it runs the match against every instance of that component at once, so whether there are 1 or 10 instances of CodeExample, the content for all of them is returned.
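The symptom can be reproduced without XRegExp. The sketch below is only an illustration: `matchBetween` is a hypothetical stand-in for `XRegExp.matchRecursive` built on the native `RegExp`, it assumes tags of the same name are never nested, and the sample string is made up. It suggests that one call returns every instance of a tag name, so a loop that keeps overwriting `body` ends up with the last instance each time, while an explicit index selects a single one.

```javascript
// Hypothetical stand-in for XRegExp.matchRecursive (assumes same-named tags are never nested).
// Returns the text between every opening/closing pair for the given tag name.
function matchBetween(data, name) {
  const pattern = new RegExp(
    `<!-- @\\[${name}\\][-.\\w+]* -->([\\s\\S]*?)<!-- @\\[/${name}\\] -->`,
    'g'
  )
  return Array.from(data.matchAll(pattern), (m) => m[1])
}

// Made-up sample input with two identical CodeExample tags.
const sample = [
  '<!-- @[CodeExample] -->first<!-- @[/CodeExample] -->',
  '<!-- @[NoteBlock].warning -->note<!-- @[/NoteBlock] -->',
  '<!-- @[CodeExample] -->second<!-- @[/CodeExample] -->'
].join('\n')

const contents = matchBetween(sample, 'CodeExample')

// Overwriting a single variable in a loop keeps only the last instance.
let body = ''
contents.forEach((content) => {
  body = content
})

console.log(contents.length) // every instance comes back from one call
console.log(body)            // the last instance wins
console.log(contents[0])     // an explicit index selects one instance
```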



Is there a way to add some sort of callback to matchRecursive? Failing that, is there a way to tell which instance of CodeExample is currently being processed, so I can reference the array position directly? I presume XRegExp knows which CodeExample tag it is looking at, so is there a way to capture that?



Here is the full code for the sake of clarity: https://pastebin.com/2MpdvdNA



The output I want is a JSON file with the following data:



[
  {
    componentName: "hero",
    classes: "",
    body: "# Creating new contexts"
  },
  {
    componentName: "CodeExample",
    classes: "",
    body: "## Usage example

```javascript
Import ICON_NAME from 'Icons'
```"
  },
  {
    componentName: "ArticleSection",
    classes: "",
    body: // This section is massive and not relevant to question so skipping
  },
  {
    componentName: "NoteBlock",
    classes: ["warning"],
    body: "> #### Be Careful
> Eu laboris eiusmod ut exercitation minim laboris ipsum magna consectetur est [commodo](/nope)."
  },
  {
    componentName: "CodeExample",
    classes: "",
    body: "#### Code example
```javascript
class ScrollingList extends React.Component {
  constructor(props) {
    super(props);
    this.listRef = React.createRef();
  }

  render() {
    return (
      <div ref={this.listRef}>{/* ...contents... */}</div>
    );
  }
}
```"
  }
  // Skipping the rest as not relevant to question
]


Sorry if I've not explained this clearly; I've been looking at this for far too long.










  • There are a lot of variables and functions in the code that haven't been defined or explained. Assuming that the text in the regex101 is your input (as well as the tag names in component), can you post your desired output?
    – CertainPerformance
    Nov 10 at 21:58










  • I've updated the question with some extra details, hope that helps :)
    – Alex Foxleigh
    Nov 10 at 22:06















javascript regex xregexp














asked Nov 10 at 21:44 by Alex Foxleigh, edited Nov 11 at 8:34























1 Answer














This is how it was resolved in the end:



    import XRegExp from 'xregexp'

    // List every opening tag in document order, recording which instance
    // of its name each one is (0 for the first, 1 for the second, ...).
    const extractComponents = data => {
      const components = []
      const re = '<!-- @\\[(\\w+)\\]([.\\w-_+]+)* -->'

      XRegExp.forEach(data, XRegExp(re, 'g'), match => {
        const name = match[1]
        const classes = match[2]

        // The number of earlier tags with the same name is this tag's zero-based index
        const instance = components.filter(item => item.name === name).length

        components.push({
          name,
          classes,
          instance
        })
      })

      return components
    }

    // Turn a class suffix such as ".warning" into ["warning"]
    const cleanClasses = classes => {
      const filteredClasses = classes ? classes.split('.') : []
      filteredClasses.shift()

      return filteredClasses
    }

    // matchRecursive returns every instance of the tag; pick the one for this component
    const extractContent = (data, component) => {
      const re = `<!-- @\\[${component.name}\\][.\\w-_+]* -->`
      const re2 = `<!-- @\\[/${component.name}\\] -->`

      return XRegExp.matchRecursive(
        data,
        re, re2, 'g'
      )[component.instance]
    }

    const parseComponents = data => {
      return extractComponents(data).map(component => {
        return {
          componentName: component.name,
          classes: cleanClasses(component.classes),
          body: extractContent(data, component)
        }
      })
    }

    export default parseComponents
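As an alternative to indexing the `matchRecursive` result by instance number, the tags could be paired in a single pass. This is only a sketch, not the approach used above: it swaps XRegExp for the built-in `RegExp` with a backreference, it assumes a tag is never nested inside another tag of the same name, and `parseInOrder`, `TAG_PAIR` and the sample input are hypothetical. Tags nested inside a matched block would not be reported separately, so it only fits flat documents.

```javascript
// One pass over the input: capture the tag name, its optional class suffix,
// and everything up to the closing tag with the same name (backreference \1).
// Assumption: same-named tags are never nested.
const TAG_PAIR = /<!-- @\[(\w+)\]([-.\w+]*) -->([\s\S]*?)<!-- @\[\/\1\] -->/g

function parseInOrder(data) {
  return Array.from(data.matchAll(TAG_PAIR), ([, name, classes, body]) => ({
    componentName: name,
    // ".warning.big" -> ["warning", "big"]; an empty suffix gives []
    classes: classes.split('.').filter(Boolean),
    body: body.trim()
  }))
}

// Made-up sample input with a repeated CodeExample tag.
const input = [
  '<!-- @[hero] -->',
  '# Creating new contexts',
  '<!-- @[/hero] -->',
  '<!-- @[CodeExample] -->',
  '## Usage example',
  '<!-- @[/CodeExample] -->',
  '<!-- @[NoteBlock].warning -->',
  '> #### Be Careful',
  '<!-- @[/NoteBlock] -->',
  '<!-- @[CodeExample] -->',
  '#### Code example',
  '<!-- @[/CodeExample] -->'
].join('\n')

const components = parseInOrder(input)
console.log(JSON.stringify(components, null, 2))
```

Because each match carries its own body, repeated tag names need no instance counter here; the trade-off is that nesting is not handled, which is what `matchRecursive` is for.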





answered Nov 12 at 14:01 by Alex Foxleigh


























