Commit

Update contributor list
zhujiem committed Jan 28, 2024
1 parent 7e6f6cd commit 89c2832
Showing 10 changed files with 51 additions and 51 deletions.
22 changes: 22 additions & 0 deletions .github/workflows/contributor.yml
@@ -0,0 +1,22 @@
name: Contributor List

on:
  pull_request:
    branches:
      - main
    types: [closed]
  workflow_dispatch:

jobs:
  contrib-readme-job:
    runs-on: ubuntu-latest
    steps:
      - name: Add contributor list
        uses: akhilmhdh/contributors-readme-action@master
        with:
          readme_path: "README.md"
          image_size: 60
          commit_message: "Automatically update contributors"
          columns_per_row: 8
        env:
          GITHUB_TOKEN: ${{ secrets.GITHUB_TOKEN }}
10 changes: 8 additions & 2 deletions README.md
@@ -147,12 +147,18 @@ The main goal of logparser is used for research and benchmark purpose. Researche
+ Please enhance logparser with efficiency and scalability with multi-processing, add failure recovery, add persistence to disk or message queue Kafka.
+ [Drain3](https://github.com/logpai/Drain3) provides a good example for your reference that is built with [practical enhancements](https://github.com/logpai/Drain3#new-features) for production scenarios.
### Citation
👋 If you use our logparser tools or benchmarking results in your publication, please cite the following papers.
### 🔥 Citation
If you use our logparser tools or benchmarking results in your publication, please cite the following papers.
+ [**ICSE'19**] Jieming Zhu, Shilin He, Jinyang Liu, Pinjia He, Qi Xie, Zibin Zheng, Michael R. Lyu. [Tools and Benchmarks for Automated Log Parsing](https://arxiv.org/pdf/1811.03509.pdf). *International Conference on Software Engineering (ICSE)*, 2019.
+ [**DSN'16**] Pinjia He, Jieming Zhu, Shilin He, Jian Li, Michael R. Lyu. [An Evaluation Study on Log Parsing and Its Use in Log Mining](https://jiemingzhu.github.io/pub/pjhe_dsn2016.pdf). *IEEE/IFIP International Conference on Dependable Systems and Networks (DSN)*, 2016.
### 🤗 Contributors
<!-- readme: contributors -start -->
<!-- readme: contributors -end -->
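The two marker comments above delimit the region of `README.md` that the contributor workflow is expected to regenerate. The following is a minimal sketch of that marker-based replacement idea in Python. It is not the action's actual implementation; the function name `replace_between_markers` and the generated HTML are hypothetical and used for illustration only.

```python
import re

# Marker comments copied from the README hunk above.
START = "<!-- readme: contributors -start -->"
END = "<!-- readme: contributors -end -->"


def replace_between_markers(readme_text: str, generated: str) -> str:
    """Return readme_text with the region between START and END replaced.

    Hypothetical helper: it only illustrates the marker-based update idea.
    """
    pattern = re.compile(re.escape(START) + r".*?" + re.escape(END), re.DOTALL)
    if not pattern.search(readme_text):
        raise ValueError("contributor markers not found in README text")
    replacement = f"{START}\n{generated}\n{END}"
    # A function is passed to sub() so that backslashes in `generated`
    # are inserted literally instead of being read as escape sequences.
    return pattern.sub(lambda _match: replacement, readme_text, count=1)
```

Because the markers are written back together with the generated block, running the replacement again leaves exactly one pair of markers in the file, so the update can be repeated safely.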
### Discussion
Welcome to join our WeChat group for any question and discussion. Alternatively, you can [open an issue here](https://github.com/logpai/logparser/issues/new).
11 changes: 3 additions & 8 deletions logparser/Brain/README.md
@@ -1,13 +1,8 @@
# Brain

### Abstract

Automated log analysis can facilitate failure diagnosis for developers and operators using a large volume of logs. Log parsing is a prerequisite step for automated log analysis, which parses semi-structured logs into structured logs. However, existing parsers are difficult to apply to software-intensive systems, due to their unstable parsing accuracy on various software. Although neural network-based approaches are stable, their inefficiency makes it challenging to keep up with the speed of log production. We found that a logging statement always generates the same template words; thus, the word with the highest frequency in each log is more likely to be constant. However, identical constants and variables generated from different logging statements may break this rule. Inspired by this key insight, we propose a new stable log parsing approach, called Brain, which creates initial groups according to the longest common pattern. Then a bidirectional tree is used to hierarchically complement the constant words to the longest common pattern to form the complete log template efficiently. Experimental results on 16 benchmark datasets show that our approach outperforms the state-of-the-art parsers on two widely-used parsing accuracy metrics, and it only takes around 46 seconds to process one million lines of logs.

Read more information about Brain from the following papers:

+ Siyu Yu, Pinjia He, Ningjiang Chen, and Yifan Wu. [Brain: Log Parsing with Bidirectional Parallel Tree](https://ieeexplore.ieee.org/abstract/document/10109145), *IEEE Transactions on Service Computing*, 2023.


### Running

@@ -59,9 +54,9 @@ Running the benchmark script on Loghub_2k datasets, you could obtain the followi
| Mac | 0.995821 | 0.942 |


### Citation
### 🔥 Citation

:telescope: If you use our logparser tools or benchmarking results in your publication, please kindly cite the following papers.
If you use the code or benchmarking results in your publication, please kindly cite the following papers.

+ [**TSC'23**] Siyu Yu, Pinjia He, Ningjiang Chen, and Yifan Wu. [Brain: Log Parsing with Bidirectional Parallel Tree](https://ieeexplore.ieee.org/abstract/document/10109145), *IEEE Transactions on Service Computing*, 2023.
+ [**ICSE'19**] Jieming Zhu, Shilin He, Jinyang Liu, Pinjia He, Qi Xie, Zibin Zheng, Michael R. Lyu. [Tools and Benchmarks for Automated Log Parsing](https://arxiv.org/pdf/1811.03509.pdf). *International Conference on Software Engineering (ICSE)*, 2019.
+ [**DSN'16**] Pinjia He, Jieming Zhu, Shilin He, Jian Li, Michael R. Lyu. [An Evaluation Study on Log Parsing and Its Use in Log Mining](https://jiemingzhu.github.io/pub/pjhe_dsn2016.pdf). *IEEE/IFIP International Conference on Dependable Systems and Networks (DSN)*, 2016.
13 changes: 3 additions & 10 deletions logparser/DivLog/README.md
@@ -1,14 +1,7 @@
# DivLog

### Abstract

DivLog is an online LLM-based log parsing framework via in-context learning. It supports various LLMs as engines through API for high-quality parsing results.

Read more information about DivLog from the following papers:

+ Junjielong Xu, Ruichun Yang, Yintong Huo, Chengyu Zhang, and Pinjia He. [DivLog: Log Parsing with Prompt Enhanced In-Context Learning](https://doi.org/10.1145/3597503.3639155). *In 2024 IEEE/ACM 46th International Conference on Software Engineering (ICSE’24)*


### Running

Install the required environment:
@@ -80,9 +73,9 @@ Running the benchmark script on Loghub_2k datasets, you could obtain the followi
| Hadoop | 0.9960 | 0.982609 | 0.991228 | 0.9940 |


### Citation
### 🔥 Citation

:telescope: If you use our logparser tools or benchmarking results in your publication, please kindly cite the following papers.
If you use the code or benchmarking results in your publication, please kindly cite the following papers.

+ [**ICSE'24**] Junjielong Xu, Ruichun Yang, Yintong Huo, Chengyu Zhang, and Pinjia He. [DivLog: Log Parsing with Prompt Enhanced In-Context Learning](https://doi.org/10.1145/3597503.3639155). *IEEE/ACM 46th International Conference on Software Engineering (ICSE)*, 2024.
+ [**ICSE'19**] Jieming Zhu, Shilin He, Jinyang Liu, Pinjia He, Qi Xie, Zibin Zheng, Michael R. Lyu. [Tools and Benchmarks for Automated Log Parsing](https://arxiv.org/pdf/1811.03509.pdf). *International Conference on Software Engineering (ICSE)*, 2019.
+ [**DSN'16**] Pinjia He, Jieming Zhu, Shilin He, Jian Li, Michael R. Lyu. [An Evaluation Study on Log Parsing and Its Use in Log Mining](https://jiemingzhu.github.io/pub/pjhe_dsn2016.pdf). *IEEE/IFIP International Conference on Dependable Systems and Networks (DSN)*, 2016.
19 changes: 6 additions & 13 deletions logparser/Drain/README.md
@@ -2,11 +2,6 @@

Drain is an online log parser that can parse logs into structured events in a streaming and timely manner. It employs a parse tree with fixed depth to guide the log group search process, which effectively avoids constructing a very deep and unbalanced tree.
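The fixed-depth tree idea described above can be sketched in a few lines of Python. This is a simplified illustration, not the implementation shipped in `logparser/Drain`: the class names `FixedDepthTree` and `LogGroup`, the `depth` and `sim_threshold` parameters, and the rule that maps tokens containing digits to a wildcard are assumptions made for the example.

```python
from __future__ import annotations

from dataclasses import dataclass, field

WILDCARD = "<*>"


@dataclass
class LogGroup:
    """One log template plus the ids of the messages assigned to it."""
    template: list[str]
    log_ids: list[int] = field(default_factory=list)


class FixedDepthTree:
    """Hypothetical, simplified fixed-depth parse tree for log grouping."""

    def __init__(self, depth: int = 2, sim_threshold: float = 0.5):
        self.depth = depth                  # leading tokens used as tree levels
        self.sim_threshold = sim_threshold  # minimum similarity to join a group
        self.root: dict = {}

    def _leaf(self, tokens: list[str]) -> list[LogGroup]:
        # First level: number of tokens. Next levels: the leading tokens.
        node = self.root.setdefault(len(tokens), {})
        for tok in tokens[: self.depth]:
            # Assumption: tokens with digits are likely variables, so they
            # share one wildcard branch instead of creating many branches.
            key = WILDCARD if any(c.isdigit() for c in tok) else tok
            node = node.setdefault(key, {})
        # None cannot collide with a token key, so it marks the group list.
        return node.setdefault(None, [])

    def add(self, log_id: int, message: str) -> LogGroup:
        tokens = message.split()
        groups = self._leaf(tokens)
        best, best_sim = None, -1.0
        for group in groups:
            same = sum(a == b for a, b in zip(group.template, tokens))
            sim = same / len(tokens) if tokens else 1.0
            if sim > best_sim:
                best, best_sim = group, sim
        if best is None or best_sim < self.sim_threshold:
            # No group is similar enough: start a new one from this message.
            best = LogGroup(template=list(tokens))
            groups.append(best)
        else:
            # Positions that differ from the template become wildcards.
            best.template = [
                a if a == b else WILDCARD for a, b in zip(best.template, tokens)
            ]
        best.log_ids.append(log_id)
        return best
```

Since every lookup walks at most `depth + 1` levels before reaching a leaf, the search cost per message does not grow with tree depth, which is the property the description above attributes to the fixed-depth design.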

Read more information about Drain from the following paper:

+ Pinjia He, Jieming Zhu, Zibin Zheng, and Michael R. Lyu. [Drain: An Online Log Parsing Approach with Fixed Depth Tree](http://jiemingzhu.github.io/pub/pjhe_icws2017.pdf), *Proceedings of the 24th International Conference on Web Services (ICWS)*, 2017.


### Running

The code has been tested in the following environment:
@@ -51,15 +46,13 @@ Running the benchmark script on Loghub_2k datasets, you could obtain the followi
| OpenStack | 0.992536 | 0.7325 |
| Mac | 0.975451 | 0.7865 |

### Industrial Adoption

### Citation

:telescope: If you use our logparser tools or benchmarking results in your publication, please kindly cite the following papers.

+ [**ICSE'19**] Jieming Zhu, Shilin He, Jinyang Liu, Pinjia He, Qi Xie, Zibin Zheng, Michael R. Lyu. [Tools and Benchmarks for Automated Log Parsing](https://arxiv.org/pdf/1811.03509.pdf). *International Conference on Software Engineering (ICSE)*, 2019.
+ [**DSN'16**] Pinjia He, Jieming Zhu, Shilin He, Jian Li, Michael R. Lyu. [An Evaluation Study on Log Parsing and Its Use in Log Mining](https://jiemingzhu.github.io/pub/pjhe_dsn2016.pdf). *IEEE/IFIP International Conference on Dependable Systems and Networks (DSN)*, 2016.
Researchers from IBM ([@davidohana](https://github.com/davidohana)) made an upgrade version of Drain with additional features for production use: [https://github.com/logpai/Drain3](https://github.com/logpai/Drain3).

### 🔥 Citation

### Industrial Adoption
If you use the code or benchmarking results in your publication, please kindly cite the following papers.

Researchers from IBM ([@davidohana](https://github.com/davidohana)) made an upgrade version of Drain with additional features for production use: [https://github.com/logpai/Drain3](https://github.com/logpai/Drain3).
+ [**ICWS'17**] Pinjia He, Jieming Zhu, Zibin Zheng, and Michael R. Lyu. [Drain: An Online Log Parsing Approach with Fixed Depth Tree](http://jiemingzhu.github.io/pub/pjhe_icws2017.pdf), *Proceedings of the 24th International Conference on Web Services (ICWS)*, 2017.
+ [**ICSE'19**] Jieming Zhu, Shilin He, Jinyang Liu, Pinjia He, Qi Xie, Zibin Zheng, Michael R. Lyu. [Tools and Benchmarks for Automated Log Parsing](https://arxiv.org/pdf/1811.03509.pdf). *International Conference on Software Engineering (ICSE)*, 2019.
10 changes: 3 additions & 7 deletions logparser/NuLog/README.md
@@ -2,10 +2,6 @@

Parsing semi-structured records with free-form text log messages into structured templates is the first and crucial step that enables further analysis. NuLog presents a novel parsing technique that utilizes a self-supervised learning model and formulates the parsing task as masked language modeling (MLM). In the process of parsing, the model extracts summarizations from the logs in the form of a vector embedding. This allows the coupling of the MLM as pre-training with a downstream anomaly detection task.

Read more information about Brain from the following papers:

+ Sasho Nedelkoski, Jasmin Bogatinovski, Alexander Acker, Jorge Cardoso, Odej Kao. [Self-Supervised Log Parsing](https://arxiv.org/abs/2003.07905), *Joint European Conference on Machine Learning and Knowledge Discovery in Databases (ECML-PKDD)*, 2020.


### Running

@@ -46,9 +42,9 @@ Running the benchmark script on Loghub_2k datasets, you could obtain the followi
| Mac | 0.748933 | 0.8165 |
| Spark | 0.999996 | 0.998 |

### Citation
### 🔥 Citation

:telescope: If you use our logparser tools or benchmarking results in your publication, please kindly cite the following papers.
If you use the code or benchmarking results in your publication, please kindly cite the following papers.

+ [**PKDD'20**] Sasho Nedelkoski, Jasmin Bogatinovski, Alexander Acker, Jorge Cardoso, Odej Kao. [Self-Supervised Log Parsing](https://arxiv.org/abs/2003.07905), *Joint European Conference on Machine Learning and Knowledge Discovery in Databases (ECML-PKDD)*, 2020.
+ [**ICSE'19**] Jieming Zhu, Shilin He, Jinyang Liu, Pinjia He, Qi Xie, Zibin Zheng, Michael R. Lyu. [Tools and Benchmarks for Automated Log Parsing](https://arxiv.org/pdf/1811.03509.pdf). *International Conference on Software Engineering (ICSE)*, 2019.
+ [**DSN'16**] Pinjia He, Jieming Zhu, Shilin He, Jian Li, Michael R. Lyu. [An Evaluation Study on Log Parsing and Its Use in Log Mining](https://jiemingzhu.github.io/pub/pjhe_dsn2016.pdf). *IEEE/IFIP International Conference on Dependable Systems and Networks (DSN)*, 2016.
2 changes: 1 addition & 1 deletion logparser/NuLog/requirements.txt
@@ -1,4 +1,4 @@
pillow==10.0.1
pillow==6.1.0
pandas
regex==2022.3.2
numpy
1 change: 0 additions & 1 deletion logparser/Spell/README.md
@@ -57,4 +57,3 @@ Running the benchmark script on Loghub_2k datasets, you could obtain the followi

+ [**ICSE'19**] Jieming Zhu, Shilin He, Jinyang Liu, Pinjia He, Qi Xie, Zibin Zheng, Michael R. Lyu. [Tools and Benchmarks for Automated Log Parsing](https://arxiv.org/pdf/1811.03509.pdf). *International Conference on Software Engineering (ICSE)*, 2019.
+ [**DSN'16**] Pinjia He, Jieming Zhu, Shilin He, Jian Li, Michael R. Lyu. [An Evaluation Study on Log Parsing and Its Use in Log Mining](https://jiemingzhu.github.io/pub/pjhe_dsn2016.pdf). *IEEE/IFIP International Conference on Dependable Systems and Networks (DSN)*, 2016.

10 changes: 3 additions & 7 deletions logparser/ULP/README.md
@@ -2,10 +2,6 @@

ULP (Universal Log Parsing) is a highly accurate log parsing tool that extracts templates from unstructured log data. ULP learns from sample log data to recognize future log events. It combines pattern matching and frequency analysis techniques. First, log events are organized into groups using a text processing method. Frequency analysis is then applied locally to instances of the same group to identify static and dynamic content of log events. When applied to 10 log datasets of the Loghub benchmark, ULP achieves an average accuracy of 89.2%, which outperforms the accuracy of four leading log parsing tools, namely Drain, Logram, Spell and AEL. Additionally, ULP can parse up to four million log events in less than 3 minutes. ULP can be readily used by practitioners and researchers to parse large log files effectively and efficiently so as to support log analysis tasks.
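The local frequency analysis step can be illustrated with a short Python sketch. It assumes the grouping step has already produced a list of events that belong to the same group, and it uses a deliberately simple rule: a token that occurs in every event of the group is treated as static, and any other token as dynamic. The function name `template_from_group` and this exact rule are assumptions for illustration, not the code in `logparser/ULP`.

```python
from collections import Counter


def template_from_group(events: list[str], placeholder: str = "<*>") -> str:
    """Derive one template from log events that already share a group.

    Hypothetical helper: tokens present in every event are kept as static
    text, all other tokens are replaced by the placeholder.
    """
    if not events:
        raise ValueError("a group needs at least one log event")
    tokenized = [event.split() for event in events]
    # Count in how many events each token occurs; set() avoids counting a
    # token twice when it repeats inside a single event.
    event_freq = Counter(tok for toks in tokenized for tok in set(toks))
    total = len(tokenized)
    # The first event serves as the skeleton of the template.
    return " ".join(
        tok if event_freq[tok] == total else placeholder for tok in tokenized[0]
    )


if __name__ == "__main__":
    group = [
        "Connection from 10.0.0.1 closed",
        "Connection from 10.0.0.2 closed",
        "Connection from 10.0.0.3 closed",
    ]
    print(template_from_group(group))
```

Counting frequencies inside one group rather than over the whole log file keeps a token that is constant in one event type from being masked just because it is rare globally.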

Read more information about Drain from the following paper:

+ Issam Sedki, Abdelwahab Hamou-Lhadj, Otmane Ait-Mohamed, Mohammed A. Shehab. [An Effective Approach for Parsing Large Log Files](https://users.encs.concordia.ca/~abdelw/papers/ICSME2022_ULP.pdf), *Proceedings of the IEEE International Conference on Software Maintenance and Evolution (ICSME)*, 2022.

### Running

The code has been tested in the following environment:
@@ -51,9 +47,9 @@ Running the benchmark script on Loghub_2k datasets, you could obtain the followi
| Mac | 0.981294 | 0.814 |


### Citation
### 🔥 Citation

:telescope: If you use our logparser tools or benchmarking results in your publication, please kindly cite the following papers.
If you use the code or benchmarking results in your publication, please kindly cite the following papers.

+ [**ICSME'22**] Issam Sedki, Abdelwahab Hamou-Lhadj, Otmane Ait-Mohamed, Mohammed A. Shehab. [An Effective Approach for Parsing Large Log Files](https://users.encs.concordia.ca/~abdelw/papers/ICSME2022_ULP.pdf), *Proceedings of the IEEE International Conference on Software Maintenance and Evolution (ICSME)*, 2022.
+ [**ICSE'19**] Jieming Zhu, Shilin He, Jinyang Liu, Pinjia He, Qi Xie, Zibin Zheng, Michael R. Lyu. [Tools and Benchmarks for Automated Log Parsing](https://arxiv.org/pdf/1811.03509.pdf). *International Conference on Software Engineering (ICSE)*, 2019.
+ [**DSN'16**] Pinjia He, Jieming Zhu, Shilin He, Jian Li, Michael R. Lyu. [An Evaluation Study on Log Parsing and Its Use in Log Mining](https://jiemingzhu.github.io/pub/pjhe_dsn2016.pdf). *IEEE/IFIP International Conference on Dependable Systems and Networks (DSN)*, 2016.
4 changes: 2 additions & 2 deletions logparser/logmatch/README.md
@@ -17,9 +17,9 @@ Run the following scripts to start the demo:
python demo.py
```

### Citation
### 🔥 Citation

:telescope: If you use our logparser tools or benchmarking results in your publication, please kindly cite the following papers.
If you use the code or benchmarking results in your publication, please kindly cite the following papers.

+ [**ICSE'19**] Jieming Zhu, Shilin He, Jinyang Liu, Pinjia He, Qi Xie, Zibin Zheng, Michael R. Lyu. [Tools and Benchmarks for Automated Log Parsing](https://arxiv.org/pdf/1811.03509.pdf). *International Conference on Software Engineering (ICSE)*, 2019.
+ [**DSN'16**] Pinjia He, Jieming Zhu, Shilin He, Jian Li, Michael R. Lyu. [An Evaluation Study on Log Parsing and Its Use in Log Mining](https://jiemingzhu.github.io/pub/pjhe_dsn2016.pdf). *IEEE/IFIP International Conference on Dependable Systems and Networks (DSN)*, 2016.
