Extract a certain string which can appear several times in a file [closed]









I have a text file that I want to read so I can extract a certain string (which can appear several times). Then I want to print the result.



The string I'm trying to extract is the value of Rule MATCH Name.



Text file example:



201819:34:40Z ubuntu : Info: MODULE: FileScan MESSAGE: Scanning test 
201809:34:40Z ubuntu: Alert: MODULE: FileScan MESSAGE: FILE: /test/76.bin SCORE: 140 TYPE: EXE AutoUpdates https://www.test.com/files: **Rule MATCH Name**: this_is_test1 SUBSCORE:100
201819:34:40Z ubuntu : Info: MODULE: FileScan MESSAGE: Scanning test
201809:34:40Z ubuntu: Alert: MODULE: FileScan MESSAGE: FILE: /test/7164.bin SCORE: 140 TYPE: EXE AutoUpdates https://www.test.com/files: **Rule MATCH Name**: this_is_test2 SUBSCORE:90
201819:34:40Z ubuntu : Info: MODULE: FileScan MESSAGE: Scanning test
201809:34:40Z ubuntu: Alert: MODULE: FileScan MESSAGE: FILE: /test/764.bin SCORE: 140 TYPE: EXE AutoUpdates https://www.test.com/files: **Rule MATCH Name**: this_is_test3 SUBSCORE:15









closed as too broad by stovfl, EdChum, GhostCat, Devon_C_Miller, lagom Nov 12 at 1:55


Please edit the question to limit it to a specific problem with enough detail to identify an adequate answer. Avoid asking multiple distinct questions at once. See the How to Ask page for help clarifying this question. If this question can be reworded to fit the rules in the help center, please edit the question.










  • StackOverflow expects you to try to solve your own problem first, as your attempts help us to better understand what you want. Please edit the question to show what you've tried, so as to illustrate a specific problem you're having in a Minimal, Complete, and Verifiable example. For more information, please see How to Ask and take the Tour.
    – quant
    Nov 11 at 9:51















python text-manipulation






edited Nov 11 at 11:26 by karel
asked Nov 11 at 9:49 by bugnet17




2 Answers
accepted










You can use a regular expression (regex) to solve this problem. Regexr is a great website for creating and testing regex patterns.
Once you have a pattern that fits your problem, open the file, use readlines() to get the text, and use Python's re module to extract the values.



I made a quick solution (not sure if this is the value you are trying to extract):



import re
fl = r'201819:34:40Z ubuntu : Info: MODULE: FileScan MESSAGE: Scanning test 201809:34:40Z ubuntu: Alert: MODULE: FileScan MESSAGE: FILE: /test/76.bin SCORE: 140 TYPE: EXE AutoUpdates https://www.test.com/files: Rule MATCH Name: this_is_test1 SUBSCORE:100 201819:34:40Z ubuntu : Info: MODULE: FileScan MESSAGE: Scanning test 201809:34:40Z ubuntu: Alert: MODULE: FileScan MESSAGE: FILE: /test/7164.bin SCORE: 140 TYPE: EXE AutoUpdates https://www.test.com/files: Rule MATCH Name: this_is_test2 SUBSCORE:90 201819:34:40Z ubuntu : Info: MODULE: FileScan MESSAGE: Scanning test 201809:34:40Z ubuntu: Alert: MODULE: FileScan MESSAGE: FILE: /test/764.bin SCORE: 140 TYPE: EXE AutoUpdates https://www.test.com/files: Rule MATCH Name: this_is_test3 SUBSCORE:15'

# \s matches one whitespace character; (\w+) captures the rule name
re.findall(r'Rule MATCH Name:\s(\w+)\s', fl)
# ['this_is_test1', 'this_is_test2', 'this_is_test3']
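
To make the pattern easier to read, here is a minimal sketch that spells it out piece by piece with re.VERBOSE. It assumes the intended pattern is the literal label "Rule MATCH Name:", one whitespace character, and then a captured run of word characters. The name RULE_NAME is hypothetical, and the trailing whitespace requirement is deliberately left out so that a value at the very end of a line would still match.

```python
import re

# RULE_NAME is a hypothetical name chosen for this sketch.
RULE_NAME = re.compile(
    r"""
    Rule[ ]MATCH[ ]Name:   # literal label; spaces are written as [ ] under re.VERBOSE
    \s                     # one whitespace character after the colon
    (\w+)                  # capture group: letters, digits and underscores
    """,
    re.VERBOSE,
)

# One alert line taken from the question's sample text.
sample = (
    "201809:34:40Z ubuntu: Alert: MODULE: FileScan MESSAGE: FILE: /test/76.bin "
    "SCORE: 140 TYPE: EXE AutoUpdates https://www.test.com/files: "
    "Rule MATCH Name: this_is_test1 SUBSCORE:100"
)

# With a single capture group, findall returns only the captured text.
names = RULE_NAME.findall(sample)
print(names)  # ['this_is_test1']
```

Because the pattern has exactly one capture group, findall returns the captured names rather than the whole matched text.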


If reading from a file:



import re
with open('f.txt') as f:
    found = []  # collects every captured rule name
    for line in f.readlines():
        found += re.findall(r'Rule MATCH Name:\s(\w+)\s', line)
print(found) # ['this_is_test1', 'this_is_test2', 'this_is_test3']
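
As a variation on the file-reading example, the sketch below wraps the loop in a function and iterates over the file object directly instead of calling readlines(). The function name extract_rule_names, the temporary file, and the shortened log lines are all assumptions made for the demo. The pattern also tolerates literal ** markers around the label, in case the file really contains them as the question's sample text shows; that is a guess about the data, not something the question confirms.

```python
import re
import tempfile
from pathlib import Path

# Optional literal asterisks around the label (an assumption about the file).
PATTERN = re.compile(r"\*{0,2}Rule MATCH Name\*{0,2}:\s*(\w+)")


def extract_rule_names(path):
    """Return every Rule MATCH Name value found in the file at `path`."""
    found = []
    with open(path, encoding="utf-8") as f:
        for line in f:  # read line by line instead of loading the whole file
            found.extend(PATTERN.findall(line))
    return found


# Shortened log lines, written to a temporary file for the demo.
sample = (
    "201819:34:40Z ubuntu : Info: MODULE: FileScan MESSAGE: Scanning test\n"
    "201809:34:40Z ubuntu: Alert: FILE: /test/76.bin **Rule MATCH Name**: this_is_test1 SUBSCORE:100\n"
    "201809:34:40Z ubuntu: Alert: FILE: /test/7164.bin Rule MATCH Name: this_is_test2 SUBSCORE:90\n"
)

with tempfile.TemporaryDirectory() as tmp:
    log_path = Path(tmp) / "f.txt"
    log_path.write_text(sample, encoding="utf-8")
    names = extract_rule_names(log_path)

print(names)  # ['this_is_test1', 'this_is_test2']
```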





  • According to your example, how do I read from a file and then print the results?
    – bugnet17
    Nov 11 at 11:19










  • @bugnet17 Added an example with reading from a file
    – Dani G
    Nov 11 at 12:09

















This is pretty easy with the re module's search function. The following sketch prints every line that matches a pattern given on the command line:



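import re
import sys

# Usage: python script.py PATTERN FILE
pattern = sys.argv[1]
with open(sys.argv[2], "r") as file:
    for line in file:
        # Print each whole line that contains a match for the pattern
        if re.search(pattern, line):
            print(line, end="")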





  • It prints the whole line. I need only the value of Rule MATCH Name.
    – bugnet17
    Nov 11 at 11:30










  • Do you need the count? Printing the string multiple times won't be a good idea.
    – swapnil shashank
    Nov 11 at 11:33










  • No. I need the value of "Rule MATCH Name". For example, given Rule MATCH Name: this_is_test1, I'm trying to extract "this_is_test1".
    – bugnet17
    Nov 11 at 11:37
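
Following the comments above, here is a minimal sketch of how re.search could return only the value rather than the whole line, by adding a capture group and reading it with group(1). The helper name rule_name and the shortened sample lines are assumptions made for illustration.

```python
import re

PATTERN = re.compile(r"Rule MATCH Name:\s*(\w+)")


def rule_name(line):
    """Return the captured value, or None when the line has no match."""
    match = PATTERN.search(line)
    return match.group(1) if match else None


# Shortened sample lines: one without the label, one with it.
lines = [
    "201819:34:40Z ubuntu : Info: MODULE: FileScan MESSAGE: Scanning test",
    "201809:34:40Z ubuntu: Alert: FILE: /test/76.bin Rule MATCH Name: this_is_test1 SUBSCORE:100",
]

for line in lines:
    name = rule_name(line)
    if name is not None:  # skip lines that do not contain the label
        print(name)
```

re.search returns None when nothing matches, so the check before group(1) avoids an AttributeError on the Info lines.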


















Accepted answer: answered Nov 11 at 10:03 by Dani G, edited Nov 11 at 12:09











Second answer: answered Nov 11 at 11:07 by swapnil shashank










