The time-consuming problem of converting csv data to RDF #59
Comments
@eiglesias34 We would like to ask the team to help us look into the above performance problem. 1 [Problem domain] Our AIOps team is building our infrastructure operational KG using SDM-RDFizer. Thanks!
Dear @tangyong, Many thanks for sharing this use case. We have implemented new optimization techniques to speed up the execution of the joins in the mappings. Please let us arrange a meeting so that we can share the new version with you; it is still in the development stage. Please contact me at [email protected] Best regards, Maria-Esther Vidal
Thanks very much, @mevs! I will arrange a meeting and contact you.
Dear @mevs, I have discussed this with my team. We would first like to obtain the new optimized version so that we can compare the performance improvement and then give you feedback. I will send my request to your email. Thanks!
Dear @mevs @dachafra @eiglesias34, We have prepared a dataset that reproduces the problem and would like to send it to you to assist with the investigation and fix. If you have time to help us, please tell me how to share the dataset (~800M) and we will upload it to shared storage. Thanks!
Problem Description:
With 8 CSV files, converting about 600M of data into RDF took more than a day. We also tested converting two of the CSV files to RDF separately, which took several hours.
Data source:
The data comes from a CMDB: 8 CSV files in total, including host (18M), vm (18M), software (160M), and other data. There are one-to-many and many-to-many semantic relationships among these data.
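To illustrate why relationships between files matter for conversion time, here is a minimal sketch of joining two CSV sources on a shared key with an in-memory hash index before emitting triples. It is only an illustration and does not describe how SDM-RDFizer works internally. The column names (`host_id`, `vm_id`, `hostname`), the `runsOn` predicate, and the `http://example.org/` namespace are all hypothetical, since the real CMDB schema is not shown here.

```python
import csv
import io
from collections import defaultdict

# Hypothetical sample data; the real CMDB column names are not given in this issue.
HOST_CSV = "host_id,hostname\nh1,alpha\nh2,beta\n"
VM_CSV = "vm_id,host_id\nv1,h1\nv2,h1\nv3,h2\n"

BASE = "http://example.org/"  # placeholder namespace, an assumption


def read_rows(text):
    """Parse CSV text into a list of dicts keyed by header name."""
    return list(csv.DictReader(io.StringIO(text)))


def join_to_triples(hosts, vms):
    """Join vm rows to host rows on host_id and return N-Triples lines."""
    # Build the index once: host_id -> list of matching host rows.
    index = defaultdict(list)
    for host in hosts:
        index[host["host_id"]].append(host)

    triples = []
    # Probe the index for each vm row instead of rescanning every host row.
    for vm in vms:
        for host in index.get(vm["host_id"], []):
            triples.append(
                f"<{BASE}vm/{vm['vm_id']}> <{BASE}runsOn> "
                f"<{BASE}host/{host['host_id']}> ."
            )
    return triples


triples = join_to_triples(read_rows(HOST_CSV), read_rows(VM_CSV))
print("\n".join(triples))
```

With an index like this, the join costs roughly one pass over each file, whereas comparing every row of one file against every row of the other grows with the product of the two row counts.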
Config.ini and mapping.ttl Configuration:
Execute:
Environment:
OS: CentOS 7
CPU cores: 64
Memory: 96G