Replies: 1 comment
-
Hi, I'm not actually a regular user of OpenRefine, but looking at the code, it looks like this is the logic for indicating an exact match. That is, a result counts as an exact match if either there is only one match, or there is only one match which matches letter for letter. It's possible that the Java version used a different heuristic for what counts as a sure match.

I see the "Automatch with high confidence" check, but I'm not clear whether this is a client-side or a server-side feature. The spec has changed a bit since I was last actively developing this, but a quick browse doesn't turn up "confidence" at all, and all the matches of "auto" are about the suggestion feature.

When I was using OpenRefine, I remember you could filter on the reconciliation score. I think this can be used to accept broad swaths of results.

I'm happy to poke around a bit, but I would need copies of the files you're using to try to reproduce. You might also want to ask about Automatch on the forum.
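For reference, here is a minimal sketch of the exact-match rule as I described it in this reply, plus a simple score filter. This is only my reading of the logic, not the actual csv-reconcile code: the function names, the candidate dictionaries, and the case-sensitive comparison are all assumptions made for illustration.

```python
from typing import Dict, List


def flag_exact_match(query: str, candidates: List[Dict]) -> List[Dict]:
    """Return copies of the candidates with a boolean 'match' field.

    Hypothetical helper: a candidate is flagged when it is the only
    candidate, or when it is the only one whose name equals the query
    letter for letter (case-sensitive here, which is an assumption).
    """
    literal = [c for c in candidates if c["name"] == query]
    only_one_candidate = len(candidates) == 1
    only_one_literal = len(literal) == 1

    flagged = []
    for c in candidates:
        is_match = only_one_candidate or (only_one_literal and c["name"] == query)
        flagged.append({**c, "match": is_match})
    return flagged


def above_score(candidates: List[Dict], threshold: float) -> List[Dict]:
    """Keep candidates whose score is at least the threshold (hypothetical)."""
    return [c for c in candidates if c["score"] >= threshold]


if __name__ == "__main__":
    # Made-up candidates, purely for illustration.
    sample = [
        {"id": "1", "name": "Blue Widget", "score": 100.0},
        {"id": "2", "name": "Blue Widgets", "score": 92.0},
    ]
    for row in flag_exact_match("Blue Widget", sample):
        print(row["id"], row["name"], row["match"])
```

Under this rule, a candidate that scores 100 is still not flagged if two candidates have the same literal name, which might explain a high score without an automatch.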
-
Hi,
I really hope I can get some help.
My commands, with file names / column IDs:
$ csv-reconcile init WPProducts.csv ID Name
I want to fuzzy match the 'Title' column in the OpenRefine project against the 'Name' column in WPProducts.csv.
Here are my OpenRefine operations:
[
  {
    "op": "core/recon",
    "engineConfig": {
      "facets": [],
      "mode": "row-based"
    },
    "columnName": "Title",
    "config": {
      "mode": "standard-service",
      "service": "http://127.0.0.1:5000/reconcile",
      "identifierSpace": "http://localhost/csv_reconcile/ids",
      "schemaSpace": "http://localhost/csv_reconcile/schema",
      "autoMatch": true,
      "columnDetails": [],
      "limit": 0
    },
    "description": "Reconcile cells in column Title to type null"
  }
]
Screenshot of the outcome is here:
The second item scored 100%, which, from my past use of reconcile-csv (the Java version), would definitely have been automatched.
And the summary here:
Based on this, most matches would automatch as
Is there something I'm missing in the settings?
Also, if I create an adjacent column from 'cell.recon.match.id', should it update dynamically with the matched ID as I progress through the manual confirmation process?
Thanks for your help! I really hope I can sort this out pronto!