Extract a certain string which can appear several times in a file [closed]
I have a text file that I want to read and extract a certain string from (it can appear several times). Then I want to print the result.
The string I'm trying to extract is the value of Rule MATCH Name.
Text file example:
201819:34:40Z ubuntu : Info: MODULE: FileScan MESSAGE: Scanning test
201809:34:40Z ubuntu: Alert: MODULE: FileScan MESSAGE: FILE: /test/76.bin SCORE: 140 TYPE: EXE AutoUpdates https://www.test.com/files: **Rule MATCH Name**: this_is_test1 SUBSCORE:100
201819:34:40Z ubuntu : Info: MODULE: FileScan MESSAGE: Scanning test
201809:34:40Z ubuntu: Alert: MODULE: FileScan MESSAGE: FILE: /test/7164.bin SCORE: 140 TYPE: EXE AutoUpdates https://www.test.com/files: **Rule MATCH Name**: this_is_test2 SUBSCORE:90
201819:34:40Z ubuntu : Info: MODULE: FileScan MESSAGE: Scanning test
201809:34:40Z ubuntu: Alert: MODULE: FileScan MESSAGE: FILE: /test/764.bin SCORE: 140 TYPE: EXE AutoUpdates https://www.test.com/files: **Rule MATCH Name**: this_is_test3 SUBSCORE:15
python text-manipulation
edited Nov 11 at 11:26
karel
asked Nov 11 at 9:49
bugnet17
closed as too broad by stovfl, EdChum, GhostCat, Devon_C_Miller, lagom Nov 12 at 1:55
Please edit the question to limit it to a specific problem with enough detail to identify an adequate answer. Avoid asking multiple distinct questions at once. See the How to Ask page for help clarifying this question. If this question can be reworded to fit the rules in the help center, please edit the question.
StackOverflow expects you to try to solve your own problem first, as your attempts help us to better understand what you want. Please edit the question to show what you've tried, so as to illustrate a specific problem you're having in a Minimal, Complete, and Verifiable example. For more information, please see How to Ask and take the Tour.
– quant
Nov 11 at 9:51
2 Answers
You can solve this with a regular expression. Regexr is a handy website for building and testing regex patterns.
Once you have a pattern that fits your data, open the file, use readlines() to get the text, and use Python's re module to extract the values.
Here is a quick solution (I'm not sure this is exactly the value you are trying to extract):
    import re
    fl = r'201819:34:40Z ubuntu : Info: MODULE: FileScan MESSAGE: Scanning test 201809:34:40Z ubuntu: Alert: MODULE: FileScan MESSAGE: FILE: /test/76.bin SCORE: 140 TYPE: EXE AutoUpdates https://www.test.com/files: Rule MATCH Name: this_is_test1 SUBSCORE:100 201819:34:40Z ubuntu : Info: MODULE: FileScan MESSAGE: Scanning test 201809:34:40Z ubuntu: Alert: MODULE: FileScan MESSAGE: FILE: /test/7164.bin SCORE: 140 TYPE: EXE AutoUpdates https://www.test.com/files: Rule MATCH Name: this_is_test2 SUBSCORE:90 201819:34:40Z ubuntu : Info: MODULE: FileScan MESSAGE: Scanning test 201809:34:40Z ubuntu: Alert: MODULE: FileScan MESSAGE: FILE: /test/764.bin SCORE: 140 TYPE: EXE AutoUpdates https://www.test.com/files: Rule MATCH Name: this_is_test3 SUBSCORE:15'
    # \s matches one whitespace character; (\w+) captures the value itself
    re.findall(r'Rule MATCH Name:\s(\w+)\s', fl)
    # ['this_is_test1', 'this_is_test2', 'this_is_test3']
If reading from a file:
    import re
    with open('f.txt') as f:
        found = []  # collects the captured values from every line
        for line in f.readlines():
            found += re.findall(r'Rule MATCH Name:\s(\w+)\s', line)
    print(found)  # ['this_is_test1', 'this_is_test2', 'this_is_test3']
edited Nov 11 at 12:09
answered Nov 11 at 10:03
Dani G
According to your example, how do I read from a file and then print the results?
– bugnet17
Nov 11 at 11:19
@bugnet17 Added an example with reading from a file
– Dani G
Nov 11 at 12:09
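One thing worth checking: the sample in the question shows the label wrapped in literal asterisks (**Rule MATCH Name**), which may only be leftover formatting. If the real file does contain them, a slightly more tolerant pattern might help. The following is a minimal sketch, not part of the answer above; the function name extract_rule_names and the RULE_NAME pattern are made up for illustration, and it assumes the value is always a single run of word characters.

```python
import re

# Hypothetical pattern: allows up to two literal "*" on either side of the
# label (as in the question's sample) and captures the word after the colon.
RULE_NAME = re.compile(r"\*{0,2}Rule MATCH Name\*{0,2}:\s*(\w+)")


def extract_rule_names(text):
    """Return every Rule MATCH Name value found in text, in order."""
    return RULE_NAME.findall(text)


sample = (
    "201819:34:40Z ubuntu : Info: MODULE: FileScan MESSAGE: Scanning test\n"
    "201809:34:40Z ubuntu: Alert: MODULE: FileScan MESSAGE: FILE: /test/76.bin "
    "SCORE: 140 TYPE: EXE AutoUpdates https://www.test.com/files: "
    "**Rule MATCH Name**: this_is_test1 SUBSCORE:100\n"
    "201809:34:40Z ubuntu: Alert: MODULE: FileScan MESSAGE: FILE: /test/7164.bin "
    "SCORE: 140 TYPE: EXE AutoUpdates https://www.test.com/files: "
    "Rule MATCH Name: this_is_test2 SUBSCORE:90\n"
)

print(extract_rule_names(sample))  # ['this_is_test1', 'this_is_test2']
```

Because the pattern does not require whitespace after the value, it should also pick up a value that sits at the very end of the file with no trailing newline.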
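This is fairly easy with re.search. The script below takes the pattern as its first argument and the file name as its second, and prints every line that matches:
    import re
    import sys

    # Usage: python script.py PATTERN FILE
    with open(sys.argv[2], "r") as file:
        for line in file:
            # re.search finds the pattern anywhere in the line
            if re.search(sys.argv[1], line):
                print(line, end="")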
answered Nov 11 at 11:07
swapnil shashank
It prints the whole line. I need only the value of Rule MATCH Name.
– bugnet17
Nov 11 at 11:30
Do you need the count? Printing the string multiple times won't be a good idea.
– swapnil shashank
Nov 11 at 11:33
No, I need the value of "Rule MATCH Name". For example, from "Rule MATCH Name: this_is_test1" I'm trying to extract "this_is_test1".
– bugnet17
Nov 11 at 11:37
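To get only the value rather than the whole line, one option is to keep the re.search approach but add a capture group and print the group instead of the line. This is a rough sketch under some assumptions: the label appears as plain "Rule MATCH Name:" (no surrounding asterisks), the value is a single run of word characters, and the names PATTERN and rule_names are invented for this example.

```python
import re

# Assumed label format; the capture group holds the value after the colon.
PATTERN = re.compile(r"Rule MATCH Name:\s*(\w+)")


def rule_names(lines):
    """Yield the captured value from each line that contains the label."""
    for line in lines:
        match = PATTERN.search(line)
        if match:
            # group(1) is only the captured value, not the whole line
            yield match.group(1)


demo_lines = [
    "201819:34:40Z ubuntu : Info: MODULE: FileScan MESSAGE: Scanning test\n",
    "201809:34:40Z ubuntu: Alert: MODULE: FileScan MESSAGE: FILE: /test/76.bin "
    "SCORE: 140 TYPE: EXE AutoUpdates https://www.test.com/files: "
    "Rule MATCH Name: this_is_test1 SUBSCORE:100\n",
]

for name in rule_names(demo_lines):
    print(name)  # this_is_test1
```

Since a file object yields its lines when iterated, the same function should work on an open file, for example rule_names(open_file) inside a with open(...) block.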