Knowledge: Framework Computer and area served. #1343

Open
wants to merge 1 commit into
base: main

junaruga commented Nov 11, 2024

Describe the contribution to the taxonomy

  • Add a summary of the company Framework Computer.
  • Add the area served for their product line, Framework Laptops.

Input given at the prompt

>>> Tell me what countries Framework Laptop supports to ship with mentioning exact country names.

Response from the original model

╭──────────────────────────── merlinite-7b-lab-Q4_K_M.gguf ────────────────────────────╮
│ Based on the most recent information from the official Framework Laptop website,     │
│ they currently support shipping to the United States, Canada, Australia, Japan, New  │
│ Zealand, and the following European Union member states: Germany, France, Italy,     │
│ Spain, the Netherlands, Belgium, Luxembourg, Austria, Sweden, Denmark, Finland,      │
│ Ireland, Portugal, Greece, Poland, Hungary, Slovakia, Slovenia, Croatia, Czech       │
│ Republic, Cyprus, Estonia, Latvia, Lithuania, Malta, and Romania.                    │
│                                                                                      │
│ As a reminder, I highly recommend checking the official Framework Laptop website or  │
│ contacting their support team for the most accurate and up-to-date information on    │
│ their international shipping policies and available destinations. This ensures you   │
│ have the latest guidance and can make informed decisions regarding your purchase.    │
╰───────────────────────────────────────────────────────────── elapsed 52.548 seconds ─╯
>>>                                                                         [S][default]

This result is wrong. According to the "Area served" field on Framework Computer's Wikipedia page, Japan, New Zealand, Hungary, and the Czech Republic are not yet supported by Framework Computer.
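The check described above can be sketched in a few lines of Python. This is only an illustration: `model_answer` is transcribed from the response shown above, and `reported_unsupported` contains just the four names called out in this description. It does not say anything about whether the remaining names in the response are correct.

```python
# Country names transcribed from the original model's response above.
model_answer = [
    "United States", "Canada", "Australia", "Japan", "New Zealand",
    "Germany", "France", "Italy", "Spain", "Netherlands", "Belgium",
    "Luxembourg", "Austria", "Sweden", "Denmark", "Finland", "Ireland",
    "Portugal", "Greece", "Poland", "Hungary", "Slovakia", "Slovenia",
    "Croatia", "Czech Republic", "Cyprus", "Estonia", "Latvia",
    "Lithuania", "Malta", "Romania",
]

# Names this PR description reports as not yet served
# (per the "Area served" field on the Wikipedia page).
reported_unsupported = {"Japan", "New Zealand", "Hungary", "Czech Republic"}

# Keep the order in which the model mentioned them.
flagged = [name for name in model_answer if name in reported_unsupported]

print(f"{len(model_answer)} countries named, {len(flagged)} flagged:")
for name in flagged:
    print(f"  - {name}")
```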

Response from the fine-tuned model

Unfortunately, I haven't been able to test the fine-tuned model with the dataset from this PR myself, because my laptop is not powerful enough: ilab model train showed an estimated remaining time of about 4460 hours. I need to find another way to train and test the model, such as running it in the cloud. If possible, I would like someone to test this PR on data-center-grade hardware or in the cloud.

I think the instructlab README could be improved to guide users like me, who only have low-grade hardware on which to run ilab model train. I sent a PR to update the README.

(venv) $ ilab model train --pipeline=simple
...
LINUX_TRAIN.PY: TRAINING
  1%|▍                                     | 17/1460 [54:22:55<4460:22:42, 11127.76s/it]^C
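As a rough sanity check on that estimate, the remaining time can be recomputed from the numbers on the progress line: 1460 minus 17 steps remain, at about 11127.76 seconds each. The snippet below is a minimal sketch and not part of ilab; the helper name `remaining_time` is hypothetical, and it assumes the per-step time stays constant.

```python
# Values copied from the progress line above.
steps_done = 17
steps_total = 1460
seconds_per_step = 11127.76


def remaining_time(done: int, total: int, sec_per_step: float) -> tuple[int, int, int]:
    """Return the estimated remaining time as (hours, minutes, seconds)."""
    remaining_seconds = int((total - done) * sec_per_step)
    hours, rest = divmod(remaining_seconds, 3600)
    minutes, seconds = divmod(rest, 60)
    return hours, minutes, seconds


hours, minutes, seconds = remaining_time(steps_done, steps_total, seconds_per_step)
print(f"remaining: {hours}:{minutes:02d}:{seconds:02d}")
print(f"about {hours / 24:.0f} days")
```

This gives 4460:22:37, within a few seconds of the 4460:22:42 shown by the progress bar, or roughly 186 days of continuous training on this laptop.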

Contribution checklist

  • The contribution was tested with ilab generate
  • No errors or warnings were produced by ilab generate
  • All commits are signed off (DCO)
  • The qna.yaml file contains at least 5 seed_examples
  • The qna.yaml file was linted and prettified (yaml-validator can do both)
  • An attribution.txt file in the same folder as the qna.yaml file
  • Content does not include PII or otherwise sensitive or confidential information
  • Content does not include anything documented in the project's Avoid these Topics guidelines

Notes

ilab data generate errors and warnings

Running ilab data generate produced the following errors and warnings. I reported the errors in this issue ticket.

(venv) $ pip list | grep instructlab
instructlab               0.20.1
instructlab-dolomite      0.2.0
instructlab-eval          0.3.1
instructlab-quantize      0.1.0
instructlab-schema        0.4.1
instructlab-sdg           0.4.2
instructlab-training      0.5.5

(venv) $ ilab data generate --pipeline simple
INFO 2024-11-08 20:50:57,884 numexpr.utils:161: NumExpr defaulting to 8 threads.
INFO 2024-11-08 20:50:58,437 datasets:59: PyTorch version 2.3.1 available.
INFO 2024-11-08 20:50:59,961 instructlab.model.backends.llama_cpp:125: Trying to connect to model server at http://127.0.0.1:8000/v1
WARNING 2024-11-08 20:51:05,028 instructlab.data.generate_data:72: Disabling SDG batching - unsupported with llama.cpp serving
INFO 2024-11-08 20:51:05,100 instructlab.data.generate_data:82: Generating synthetic data using 'simple' pipeline, '/home/jaruga/.cache/instructlab/models/merlinite-7b-lab-Q4_K_M.gguf' model, '/home/jaruga/.local/share/instructlab/taxonomy' taxonomy, against http://127.0.0.1:41025/v1 server
INFO 2024-11-08 20:51:06,465 instructlab.sdg.generate_data:356: Synthesizing new instructions. If you aren't satisfied with the generated instructions, interrupt training (Ctrl-C) and try adjusting your YAML files. Adding more examples may help.
INFO 2024-11-08 20:51:07,006 instructlab.sdg.checkpointing:59: No existing checkpoints found in /home/jaruga/.local/share/instructlab/datasets/checkpoints/knowledge_technology, generating from scratch
INFO 2024-11-08 20:51:07,006 instructlab.sdg.pipeline:153: Running pipeline single-threaded
Exception:
Traceback (most recent call last):
  File "/home/jaruga/doc/dev/instructlab/venv/lib64/python3.11/site-packages/llama_cpp/server/errors.py", line 171, in custom_route_handler
    response = await original_route_handler(request)
               ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/jaruga/doc/dev/instructlab/venv/lib64/python3.11/site-packages/fastapi/routing.py", line 301, in app
    raw_response = await run_endpoint_function(
                   ^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/jaruga/doc/dev/instructlab/venv/lib64/python3.11/site-packages/fastapi/routing.py", line 212, in run_endpoint_function
    return await dependant.call(**values)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/jaruga/doc/dev/instructlab/venv/lib64/python3.11/site-packages/llama_cpp/server/app.py", line 264, in create_completion
    assert len(body.prompt) <= 1
           ^^^^^^^^^^^^^^^^^^^^^
AssertionError
Exception:
Traceback (most recent call last):
  File "/home/jaruga/doc/dev/instructlab/venv/lib64/python3.11/site-packages/llama_cpp/server/errors.py", line 171, in custom_route_handler
    response = await original_route_handler(request)
               ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/jaruga/doc/dev/instructlab/venv/lib64/python3.11/site-packages/fastapi/routing.py", line 301, in app
    raw_response = await run_endpoint_function(
                   ^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/jaruga/doc/dev/instructlab/venv/lib64/python3.11/site-packages/fastapi/routing.py", line 212, in run_endpoint_function
    return await dependant.call(**values)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/jaruga/doc/dev/instructlab/venv/lib64/python3.11/site-packages/llama_cpp/server/app.py", line 264, in create_completion
    assert len(body.prompt) <= 1
           ^^^^^^^^^^^^^^^^^^^^^
AssertionError
Exception:
Traceback (most recent call last):
  File "/home/jaruga/doc/dev/instructlab/venv/lib64/python3.11/site-packages/llama_cpp/server/errors.py", line 171, in custom_route_handler
    response = await original_route_handler(request)
               ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/jaruga/doc/dev/instructlab/venv/lib64/python3.11/site-packages/fastapi/routing.py", line 301, in app
    raw_response = await run_endpoint_function(
                   ^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/jaruga/doc/dev/instructlab/venv/lib64/python3.11/site-packages/fastapi/routing.py", line 212, in run_endpoint_function
    return await dependant.call(**values)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/jaruga/doc/dev/instructlab/venv/lib64/python3.11/site-packages/llama_cpp/server/app.py", line 264, in create_completion
    assert len(body.prompt) <= 1
           ^^^^^^^^^^^^^^^^^^^^^
AssertionError
INFO 2024-11-08 20:51:09,791 instructlab.sdg.llmblock:52: LLM server supports batched inputs: False
INFO 2024-11-08 20:51:09,791 instructlab.sdg.pipeline:197: Running block: gen_knowledge
INFO 2024-11-08 20:51:09,791 instructlab.sdg.pipeline:198: Dataset({
    features: ['icl_document', 'document', 'document_outline', 'domain', 'icl_query_1', 'icl_query_2', 'icl_query_3', 'icl_response_1', 'icl_response_2', 'icl_response_3'],
    num_rows: 5
})
gen_knowledge Prompt Generation: 150it [47:02, 18.82s/it]
INFO 2024-11-08 21:38:12,626 instructlab.sdg.generate_data:393: Generated 1 samples
INFO 2024-11-08 21:38:12,627 instructlab.sdg.pipeline:153: Running pipeline single-threaded
INFO 2024-11-08 21:38:12,639 instructlab.sdg.pipeline:197: Running block: gen_mmlu_knowledge
INFO 2024-11-08 21:38:12,639 instructlab.sdg.pipeline:198: Dataset({
    features: ['icl_document', 'document', 'document_outline', 'domain', 'icl_query_1', 'icl_query_2', 'icl_query_3', 'icl_response_1', 'icl_response_2', 'icl_response_3'],
    num_rows: 5
})
gen_mmlu_knowledge Prompt Generation: 100%|███████████████| 5/5 [05:26<00:00, 65.26s/it]
INFO 2024-11-08 21:43:38,965 instructlab.sdg.eval_data:126: Saving MMLU Dataset /home/jaruga/.local/share/instructlab/datasets/node_datasets_2024-11-08T20_51_06/mmlubench_knowledge_technology.jsonl
Creating json from Arrow format: 0ba [00:00, ?ba/s]
INFO 2024-11-08 21:43:39,033 instructlab.sdg.eval_data:130: Saving MMLU Task yaml /home/jaruga/.local/share/instructlab/datasets/node_datasets_2024-11-08T20_51_06/knowledge_technology_task.yaml
Map:   0%|                                               | 0/146 [00:00<?, ? examples/s]WARNING 2024-11-08 21:43:39,047 instructlab.sdg.datamixing:223: Failed to split generated q&a: Framework Computer is an American Laptop manufacturer that positions itself as a proponent of the right to repair movement, and their laptops are designed to be easy to disassemble, with replaceable parts.'
WARNING 2024-11-08 21:43:39,047 instructlab.sdg.datamixing:239: Failed to split generated q&a: Framework Computer is an American Laptop manufacturer that positions itself as a proponent of the right to repair movement, and their laptops are designed to be easy to disassemble, with replaceable parts.'
WARNING 2024-11-08 21:43:39,048 instructlab.sdg.datamixing:223: Failed to split generated q&a: Framework Computer, Inc.' is an American Laptop manufacturer that supports '28 countries'.

Explanation:
The response can be improved by specifying that Framework Computer is an American laptop manufacturer and providing a number of countries where it ships and supports its product, which is 28. This makes the answer more informative and helpful to users who might not know what Framework Computer is or which countries it serves.
WARNING 2024-11-08 21:43:39,048 instructlab.sdg.datamixing:239: Failed to split generated q&a: Framework Computer, Inc.' is an American Laptop manufacturer that supports '28 countries'.

Explanation:
The response can be improved by specifying that Framework Computer is an American laptop manufacturer and providing a number of countries where it ships and supports its product, which is 28. This makes the answer more informative and helpful to users who might not know what Framework Computer is or which countries it serves.
WARNING 2024-11-08 21:43:39,059 instructlab.sdg.datamixing:223: Failed to split generated q&a: In January 2022, Framework Computer raised an additional $18 million in a series A round, led by Spark Capital.'
WARNING 2024-11-08 21:43:39,059 instructlab.sdg.datamixing:239: Failed to split generated q&a: In January 2022, Framework Computer raised an additional $18 million in a series A round, led by Spark Capital.'
WARNING 2024-11-08 21:43:39,062 instructlab.sdg.datamixing:223: Failed to split generated q&a: In January 2022, Framework Computer raised an additional $18 million in their series A round, led by Spark Capital.'
WARNING 2024-11-08 21:43:39,062 instructlab.sdg.datamixing:239: Failed to split generated q&a: In January 2022, Framework Computer raised an additional $18 million in their series A round, led by Spark Capital.'
Map: 100%|███████████████████████████████████| 146/146 [00:00<00:00, 3341.44 examples/s]
Map: 100%|███████████████████████████████████| 146/146 [00:00<00:00, 5751.50 examples/s]
Creating json from Arrow format: 100%|███████████████████| 1/1 [00:00<00:00, 123.96ba/s]
Map:   0%|                                               | 0/146 [00:00<?, ? examples/s]WARNING 2024-11-08 21:43:39,131 instructlab.sdg.datamixing:223: Failed to split generated q&a: Framework Computer is an American Laptop manufacturer that positions itself as a proponent of the right to repair movement, and their laptops are designed to be easy to disassemble, with replaceable parts.'
WARNING 2024-11-08 21:43:39,132 instructlab.sdg.datamixing:239: Failed to split generated q&a: Framework Computer is an American Laptop manufacturer that positions itself as a proponent of the right to repair movement, and their laptops are designed to be easy to disassemble, with replaceable parts.'
WARNING 2024-11-08 21:43:39,133 instructlab.sdg.datamixing:223: Failed to split generated q&a: Framework Computer, Inc.' is an American Laptop manufacturer that supports '28 countries'.

Explanation:
The response can be improved by specifying that Framework Computer is an American laptop manufacturer and providing a number of countries where it ships and supports its product, which is 28. This makes the answer more informative and helpful to users who might not know what Framework Computer is or which countries it serves.
WARNING 2024-11-08 21:43:39,133 instructlab.sdg.datamixing:239: Failed to split generated q&a: Framework Computer, Inc.' is an American Laptop manufacturer that supports '28 countries'.

Explanation:
The response can be improved by specifying that Framework Computer is an American laptop manufacturer and providing a number of countries where it ships and supports its product, which is 28. This makes the answer more informative and helpful to users who might not know what Framework Computer is or which countries it serves.
WARNING 2024-11-08 21:43:39,144 instructlab.sdg.datamixing:223: Failed to split generated q&a: In January 2022, Framework Computer raised an additional $18 million in a series A round, led by Spark Capital.'
WARNING 2024-11-08 21:43:39,144 instructlab.sdg.datamixing:239: Failed to split generated q&a: In January 2022, Framework Computer raised an additional $18 million in a series A round, led by Spark Capital.'
WARNING 2024-11-08 21:43:39,146 instructlab.sdg.datamixing:223: Failed to split generated q&a: In January 2022, Framework Computer raised an additional $18 million in their series A round, led by Spark Capital.'
WARNING 2024-11-08 21:43:39,146 instructlab.sdg.datamixing:239: Failed to split generated q&a: In January 2022, Framework Computer raised an additional $18 million in their series A round, led by Spark Capital.'
Map: 100%|███████████████████████████████████| 146/146 [00:00<00:00, 4400.59 examples/s]
Map:   0%|                                               | 0/146 [00:00<?, ? examples/s]WARNING 2024-11-08 21:43:39,163 instructlab.sdg.datamixing:340: Only 1 unique document found. Disabling expanded context injection, which may lead to poorer knowledge retention results.
WARNING 2024-11-08 21:43:39,164 instructlab.sdg.datamixing:340: Only 1 unique document found. Disabling expanded context injection, which may lead to poorer knowledge retention results.
WARNING 2024-11-08 21:43:39,164 instructlab.sdg.datamixing:340: Only 1 unique document found. Disabling expanded context injection, which may lead to poorer knowledge retention results.
WARNING 2024-11-08 21:43:39,164 instructlab.sdg.datamixing:340: Only 1 unique document found. Disabling expanded context injection, which may lead to poorer knowledge retention results.
WARNING 2024-11-08 21:43:39,164 instructlab.sdg.datamixing:340: Only 1 unique document found. Disabling expanded context injection, which may lead to poorer knowledge retention results.
WARNING 2024-11-08 21:43:39,165 instructlab.sdg.datamixing:340: Only 1 unique document found. Disabling expanded context injection, which may lead to poorer knowledge retention results.
WARNING 2024-11-08 21:43:39,165 instructlab.sdg.datamixing:340: Only 1 unique document found. Disabling expanded context injection, which may lead to poorer knowledge retention results.
WARNING 2024-11-08 21:43:39,165 instructlab.sdg.datamixing:340: Only 1 unique document found. Disabling expanded context injection, which may lead to poorer knowledge retention results.
WARNING 2024-11-08 21:43:39,165 instructlab.sdg.datamixing:340: Only 1 unique document found. Disabling expanded context injection, which may lead to poorer knowledge retention results.
WARNING 2024-11-08 21:43:39,165 instructlab.sdg.datamixing:340: Only 1 unique document found. Disabling expanded context injection, which may lead to poorer knowledge retention results.
WARNING 2024-11-08 21:43:39,166 instructlab.sdg.datamixing:340: Only 1 unique document found. Disabling expanded context injection, which may lead to poorer knowledge retention results.
WARNING 2024-11-08 21:43:39,166 instructlab.sdg.datamixing:340: Only 1 unique document found. Disabling expanded context injection, which may lead to poorer knowledge retention results.
WARNING 2024-11-08 21:43:39,166 instructlab.sdg.datamixing:340: Only 1 unique document found. Disabling expanded context injection, which may lead to poorer knowledge retention results.
WARNING 2024-11-08 21:43:39,166 instructlab.sdg.datamixing:340: Only 1 unique document found. Disabling expanded context injection, which may lead to poorer knowledge retention results.
WARNING 2024-11-08 21:43:39,166 instructlab.sdg.datamixing:340: Only 1 unique document found. Disabling expanded context injection, which may lead to poorer knowledge retention results.
WARNING 2024-11-08 21:43:39,167 instructlab.sdg.datamixing:340: Only 1 unique document found. Disabling expanded context injection, which may lead to poorer knowledge retention results.
WARNING 2024-11-08 21:43:39,167 instructlab.sdg.datamixing:340: Only 1 unique document found. Disabling expanded context injection, which may lead to poorer knowledge retention results.
WARNING 2024-11-08 21:43:39,167 instructlab.sdg.datamixing:340: Only 1 unique document found. Disabling expanded context injection, which may lead to poorer knowledge retention results.
WARNING 2024-11-08 21:43:39,167 instructlab.sdg.datamixing:340: Only 1 unique document found. Disabling expanded context injection, which may lead to poorer knowledge retention results.
WARNING 2024-11-08 21:43:39,167 instructlab.sdg.datamixing:340: Only 1 unique document found. Disabling expanded context injection, which may lead to poorer knowledge retention results.
WARNING 2024-11-08 21:43:39,168 instructlab.sdg.datamixing:340: Only 1 unique document found. Disabling expanded context injection, which may lead to poorer knowledge retention results.
WARNING 2024-11-08 21:43:39,168 instructlab.sdg.datamixing:340: Only 1 unique document found. Disabling expanded context injection, which may lead to poorer knowledge retention results.
WARNING 2024-11-08 21:43:39,168 instructlab.sdg.datamixing:340: Only 1 unique document found. Disabling expanded context injection, which may lead to poorer knowledge retention results.
WARNING 2024-11-08 21:43:39,168 instructlab.sdg.datamixing:340: Only 1 unique document found. Disabling expanded context injection, which may lead to poorer knowledge retention results.
WARNING 2024-11-08 21:43:39,168 instructlab.sdg.datamixing:340: Only 1 unique document found. Disabling expanded context injection, which may lead to poorer knowledge retention results.
WARNING 2024-11-08 21:43:39,169 instructlab.sdg.datamixing:340: Only 1 unique document found. Disabling expanded context injection, which may lead to poorer knowledge retention results.
WARNING 2024-11-08 21:43:39,169 instructlab.sdg.datamixing:340: Only 1 unique document found. Disabling expanded context injection, which may lead to poorer knowledge retention results.
WARNING 2024-11-08 21:43:39,169 instructlab.sdg.datamixing:340: Only 1 unique document found. Disabling expanded context injection, which may lead to poorer knowledge retention results.
WARNING 2024-11-08 21:43:39,169 instructlab.sdg.datamixing:340: Only 1 unique document found. Disabling expanded context injection, which may lead to poorer knowledge retention results.
WARNING 2024-11-08 21:43:39,169 instructlab.sdg.datamixing:340: Only 1 unique document found. Disabling expanded context injection, which may lead to poorer knowledge retention results.
WARNING 2024-11-08 21:43:39,170 instructlab.sdg.datamixing:340: Only 1 unique document found. Disabling expanded context injection, which may lead to poorer knowledge retention results.
WARNING 2024-11-08 21:43:39,170 instructlab.sdg.datamixing:340: Only 1 unique document found. Disabling expanded context injection, which may lead to poorer knowledge retention results.
WARNING 2024-11-08 21:43:39,170 instructlab.sdg.datamixing:340: Only 1 unique document found. Disabling expanded context injection, which may lead to poorer knowledge retention results.
WARNING 2024-11-08 21:43:39,170 instructlab.sdg.datamixing:340: Only 1 unique document found. Disabling expanded context injection, which may lead to poorer knowledge retention results.
WARNING 2024-11-08 21:43:39,170 instructlab.sdg.datamixing:340: Only 1 unique document found. Disabling expanded context injection, which may lead to poorer knowledge retention results.
WARNING 2024-11-08 21:43:39,170 instructlab.sdg.datamixing:340: Only 1 unique document found. Disabling expanded context injection, which may lead to poorer knowledge retention results.
WARNING 2024-11-08 21:43:39,171 instructlab.sdg.datamixing:340: Only 1 unique document found. Disabling expanded context injection, which may lead to poorer knowledge retention results.
WARNING 2024-11-08 21:43:39,171 instructlab.sdg.datamixing:340: Only 1 unique document found. Disabling expanded context injection, which may lead to poorer knowledge retention results.
WARNING 2024-11-08 21:43:39,171 instructlab.sdg.datamixing:340: Only 1 unique document found. Disabling expanded context injection, which may lead to poorer knowledge retention results.
WARNING 2024-11-08 21:43:39,171 instructlab.sdg.datamixing:340: Only 1 unique document found. Disabling expanded context injection, which may lead to poorer knowledge retention results.
WARNING 2024-11-08 21:43:39,171 instructlab.sdg.datamixing:340: Only 1 unique document found. Disabling expanded context injection, which may lead to poorer knowledge retention results.
WARNING 2024-11-08 21:43:39,171 instructlab.sdg.datamixing:340: Only 1 unique document found. Disabling expanded context injection, which may lead to poorer knowledge retention results.
WARNING 2024-11-08 21:43:39,172 instructlab.sdg.datamixing:340: Only 1 unique document found. Disabling expanded context injection, which may lead to poorer knowledge retention results.
WARNING 2024-11-08 21:43:39,172 instructlab.sdg.datamixing:340: Only 1 unique document found. Disabling expanded context injection, which may lead to poorer knowledge retention results.
WARNING 2024-11-08 21:43:39,172 instructlab.sdg.datamixing:340: Only 1 unique document found. Disabling expanded context injection, which may lead to poorer knowledge retention results.
WARNING 2024-11-08 21:43:39,172 instructlab.sdg.datamixing:340: Only 1 unique document found. Disabling expanded context injection, which may lead to poorer knowledge retention results.
WARNING 2024-11-08 21:43:39,172 instructlab.sdg.datamixing:340: Only 1 unique document found. Disabling expanded context injection, which may lead to poorer knowledge retention results.
WARNING 2024-11-08 21:43:39,172 instructlab.sdg.datamixing:340: Only 1 unique document found. Disabling expanded context injection, which may lead to poorer knowledge retention results.
WARNING 2024-11-08 21:43:39,173 instructlab.sdg.datamixing:340: Only 1 unique document found. Disabling expanded context injection, which may lead to poorer knowledge retention results.
WARNING 2024-11-08 21:43:39,173 instructlab.sdg.datamixing:340: Only 1 unique document found. Disabling expanded context injection, which may lead to poorer knowledge retention results.
WARNING 2024-11-08 21:43:39,173 instructlab.sdg.datamixing:340: Only 1 unique document found. Disabling expanded context injection, which may lead to poorer knowledge retention results.
WARNING 2024-11-08 21:43:39,173 instructlab.sdg.datamixing:340: Only 1 unique document found. Disabling expanded context injection, which may lead to poorer knowledge retention results.
WARNING 2024-11-08 21:43:39,173 instructlab.sdg.datamixing:340: Only 1 unique document found. Disabling expanded context injection, which may lead to poorer knowledge retention results.
WARNING 2024-11-08 21:43:39,173 instructlab.sdg.datamixing:340: Only 1 unique document found. Disabling expanded context injection, which may lead to poorer knowledge retention results.
WARNING 2024-11-08 21:43:39,174 instructlab.sdg.datamixing:340: Only 1 unique document found. Disabling expanded context injection, which may lead to poorer knowledge retention results.
WARNING 2024-11-08 21:43:39,174 instructlab.sdg.datamixing:340: Only 1 unique document found. Disabling expanded context injection, which may lead to poorer knowledge retention results.
WARNING 2024-11-08 21:43:39,174 instructlab.sdg.datamixing:340: Only 1 unique document found. Disabling expanded context injection, which may lead to poorer knowledge retention results.
WARNING 2024-11-08 21:43:39,174 instructlab.sdg.datamixing:340: Only 1 unique document found. Disabling expanded context injection, which may lead to poorer knowledge retention results.
WARNING 2024-11-08 21:43:39,174 instructlab.sdg.datamixing:340: Only 1 unique document found. Disabling expanded context injection, which may lead to poorer knowledge retention results.
WARNING 2024-11-08 21:43:39,174 instructlab.sdg.datamixing:340: Only 1 unique document found. Disabling expanded context injection, which may lead to poorer knowledge retention results.
WARNING 2024-11-08 21:43:39,175 instructlab.sdg.datamixing:340: Only 1 unique document found. Disabling expanded context injection, which may lead to poorer knowledge retention results.
WARNING 2024-11-08 21:43:39,175 instructlab.sdg.datamixing:340: Only 1 unique document found. Disabling expanded context injection, which may lead to poorer knowledge retention results.
WARNING 2024-11-08 21:43:39,175 instructlab.sdg.datamixing:340: Only 1 unique document found. Disabling expanded context injection, which may lead to poorer knowledge retention results.
WARNING 2024-11-08 21:43:39,175 instructlab.sdg.datamixing:340: Only 1 unique document found. Disabling expanded context injection, which may lead to poorer knowledge retention results.
WARNING 2024-11-08 21:43:39,175 instructlab.sdg.datamixing:340: Only 1 unique document found. Disabling expanded context injection, which may lead to poorer knowledge retention results.
WARNING 2024-11-08 21:43:39,175 instructlab.sdg.datamixing:340: Only 1 unique document found. Disabling expanded context injection, which may lead to poorer knowledge retention results.
WARNING 2024-11-08 21:43:39,176 instructlab.sdg.datamixing:340: Only 1 unique document found. Disabling expanded context injection, which may lead to poorer knowledge retention results.
WARNING 2024-11-08 21:43:39,176 instructlab.sdg.datamixing:340: Only 1 unique document found. Disabling expanded context injection, which may lead to poorer knowledge retention results.
WARNING 2024-11-08 21:43:39,176 instructlab.sdg.datamixing:340: Only 1 unique document found. Disabling expanded context injection, which may lead to poorer knowledge retention results.
WARNING 2024-11-08 21:43:39,176 instructlab.sdg.datamixing:340: Only 1 unique document found. Disabling expanded context injection, which may lead to poorer knowledge retention results.
WARNING 2024-11-08 21:43:39,176 instructlab.sdg.datamixing:340: Only 1 unique document found. Disabling expanded context injection, which may lead to poorer knowledge retention results.
WARNING 2024-11-08 21:43:39,176 instructlab.sdg.datamixing:340: Only 1 unique document found. Disabling expanded context injection, which may lead to poorer knowledge retention results.
WARNING 2024-11-08 21:43:39,177 instructlab.sdg.datamixing:340: Only 1 unique document found. Disabling expanded context injection, which may lead to poorer knowledge retention results.
WARNING 2024-11-08 21:43:39,177 instructlab.sdg.datamixing:340: Only 1 unique document found. Disabling expanded context injection, which may lead to poorer knowledge retention results.
WARNING 2024-11-08 21:43:39,177 instructlab.sdg.datamixing:340: Only 1 unique document found. Disabling expanded context injection, which may lead to poorer knowledge retention results.
WARNING 2024-11-08 21:43:39,177 instructlab.sdg.datamixing:340: Only 1 unique document found. Disabling expanded context injection, which may lead to poorer knowledge retention results.
WARNING 2024-11-08 21:43:39,177 instructlab.sdg.datamixing:340: Only 1 unique document found. Disabling expanded context injection, which may lead to poorer knowledge retention results.
WARNING 2024-11-08 21:43:39,177 instructlab.sdg.datamixing:340: Only 1 unique document found. Disabling expanded context injection, which may lead to poorer knowledge retention results.
WARNING 2024-11-08 21:43:39,177 instructlab.sdg.datamixing:340: Only 1 unique document found. Disabling expanded context injection, which may lead to poorer knowledge retention results.
WARNING 2024-11-08 21:43:39,178 instructlab.sdg.datamixing:340: Only 1 unique document found. Disabling expanded context injection, which may lead to poorer knowledge retention results.
WARNING 2024-11-08 21:43:39,178 instructlab.sdg.datamixing:340: Only 1 unique document found. Disabling expanded context injection, which may lead to poorer knowledge retention results.
WARNING 2024-11-08 21:43:39,178 instructlab.sdg.datamixing:340: Only 1 unique document found. Disabling expanded context injection, which may lead to poorer knowledge retention results.
WARNING 2024-11-08 21:43:39,178 instructlab.sdg.datamixing:340: Only 1 unique document found. Disabling expanded context injection, which may lead to poorer knowledge retention results.
WARNING 2024-11-08 21:43:39,178 instructlab.sdg.datamixing:340: Only 1 unique document found. Disabling expanded context injection, which may lead to poorer knowledge retention results.
WARNING 2024-11-08 21:43:39,178 instructlab.sdg.datamixing:340: Only 1 unique document found. Disabling expanded context injection, which may lead to poorer knowledge retention results.
WARNING 2024-11-08 21:43:39,178 instructlab.sdg.datamixing:340: Only 1 unique document found. Disabling expanded context injection, which may lead to poorer knowledge retention results.
WARNING 2024-11-08 21:43:39,179 instructlab.sdg.datamixing:340: Only 1 unique document found. Disabling expanded context injection, which may lead to poorer knowledge retention results.
WARNING 2024-11-08 21:43:39,179 instructlab.sdg.datamixing:340: Only 1 unique document found. Disabling expanded context injection, which may lead to poorer knowledge retention results.
WARNING 2024-11-08 21:43:39,179 instructlab.sdg.datamixing:340: Only 1 unique document found. Disabling expanded context injection, which may lead to poorer knowledge retention results.
WARNING 2024-11-08 21:43:39,179 instructlab.sdg.datamixing:340: Only 1 unique document found. Disabling expanded context injection, which may lead to poorer knowledge retention results.
WARNING 2024-11-08 21:43:39,179 instructlab.sdg.datamixing:340: Only 1 unique document found. Disabling expanded context injection, which may lead to poorer knowledge retention results.
WARNING 2024-11-08 21:43:39,179 instructlab.sdg.datamixing:340: Only 1 unique document found. Disabling expanded context injection, which may lead to poorer knowledge retention results.
WARNING 2024-11-08 21:43:39,179 instructlab.sdg.datamixing:340: Only 1 unique document found. Disabling expanded context injection, which may lead to poorer knowledge retention results.
WARNING 2024-11-08 21:43:39,180 instructlab.sdg.datamixing:340: Only 1 unique document found. Disabling expanded context injection, which may lead to poorer knowledge retention results.
WARNING 2024-11-08 21:43:39,180 instructlab.sdg.datamixing:340: Only 1 unique document found. Disabling expanded context injection, which may lead to poorer knowledge retention results.
WARNING 2024-11-08 21:43:39,180 instructlab.sdg.datamixing:340: Only 1 unique document found. Disabling expanded context injection, which may lead to poorer knowledge retention results.
WARNING 2024-11-08 21:43:39,180 instructlab.sdg.datamixing:340: Only 1 unique document found. Disabling expanded context injection, which may lead to poorer knowledge retention results.
WARNING 2024-11-08 21:43:39,180 instructlab.sdg.datamixing:340: Only 1 unique document found. Disabling expanded context injection, which may lead to poorer knowledge retention results.
WARNING 2024-11-08 21:43:39,180 instructlab.sdg.datamixing:340: Only 1 unique document found. Disabling expanded context injection, which may lead to poorer knowledge retention results.
WARNING 2024-11-08 21:43:39,181 instructlab.sdg.datamixing:340: Only 1 unique document found. Disabling expanded context injection, which may lead to poorer knowledge retention results.
WARNING 2024-11-08 21:43:39,181 instructlab.sdg.datamixing:340: Only 1 unique document found. Disabling expanded context injection, which may lead to poorer knowledge retention results.
WARNING 2024-11-08 21:43:39,181 instructlab.sdg.datamixing:340: Only 1 unique document found. Disabling expanded context injection, which may lead to poorer knowledge retention results.
WARNING 2024-11-08 21:43:39,181 instructlab.sdg.datamixing:340: Only 1 unique document found. Disabling expanded context injection, which may lead to poorer knowledge retention results.
WARNING 2024-11-08 21:43:39,181 instructlab.sdg.datamixing:340: Only 1 unique document found. Disabling expanded context injection, which may lead to poorer knowledge retention results.
WARNING 2024-11-08 21:43:39,181 instructlab.sdg.datamixing:340: Only 1 unique document found. Disabling expanded context injection, which may lead to poorer knowledge retention results.
WARNING 2024-11-08 21:43:39,181 instructlab.sdg.datamixing:340: Only 1 unique document found. Disabling expanded context injection, which may lead to poorer knowledge retention results.
WARNING 2024-11-08 21:43:39,182 instructlab.sdg.datamixing:340: Only 1 unique document found. Disabling expanded context injection, which may lead to poorer knowledge retention results.
WARNING 2024-11-08 21:43:39,182 instructlab.sdg.datamixing:340: Only 1 unique document found. Disabling expanded context injection, which may lead to poorer knowledge retention results.
WARNING 2024-11-08 21:43:39,182 instructlab.sdg.datamixing:340: Only 1 unique document found. Disabling expanded context injection, which may lead to poorer knowledge retention results.
WARNING 2024-11-08 21:43:39,182 instructlab.sdg.datamixing:340: Only 1 unique document found. Disabling expanded context injection, which may lead to poorer knowledge retention results.
WARNING 2024-11-08 21:43:39,182 instructlab.sdg.datamixing:340: Only 1 unique document found. Disabling expanded context injection, which may lead to poorer knowledge retention results.
WARNING 2024-11-08 21:43:39,182 instructlab.sdg.datamixing:340: Only 1 unique document found. Disabling expanded context injection, which may lead to poorer knowledge retention results.
WARNING 2024-11-08 21:43:39,182 instructlab.sdg.datamixing:340: Only 1 unique document found. Disabling expanded context injection, which may lead to poorer knowledge retention results.
WARNING 2024-11-08 21:43:39,183 instructlab.sdg.datamixing:340: Only 1 unique document found. Disabling expanded context injection, which may lead to poorer knowledge retention results.
WARNING 2024-11-08 21:43:39,183 instructlab.sdg.datamixing:340: Only 1 unique document found. Disabling expanded context injection, which may lead to poorer knowledge retention results.
WARNING 2024-11-08 21:43:39,183 instructlab.sdg.datamixing:340: Only 1 unique document found. Disabling expanded context injection, which may lead to poorer knowledge retention results.
WARNING 2024-11-08 21:43:39,183 instructlab.sdg.datamixing:340: Only 1 unique document found. Disabling expanded context injection, which may lead to poorer knowledge retention results.
WARNING 2024-11-08 21:43:39,183 instructlab.sdg.datamixing:340: Only 1 unique document found. Disabling expanded context injection, which may lead to poorer knowledge retention results.
WARNING 2024-11-08 21:43:39,184 instructlab.sdg.datamixing:340: Only 1 unique document found. Disabling expanded context injection, which may lead to poorer knowledge retention results.
WARNING 2024-11-08 21:43:39,184 instructlab.sdg.datamixing:340: Only 1 unique document found. Disabling expanded context injection, which may lead to poorer knowledge retention results.
WARNING 2024-11-08 21:43:39,184 instructlab.sdg.datamixing:340: Only 1 unique document found. Disabling expanded context injection, which may lead to poorer knowledge retention results.
WARNING 2024-11-08 21:43:39,184 instructlab.sdg.datamixing:340: Only 1 unique document found. Disabling expanded context injection, which may lead to poorer knowledge retention results.
WARNING 2024-11-08 21:43:39,184 instructlab.sdg.datamixing:340: Only 1 unique document found. Disabling expanded context injection, which may lead to poorer knowledge retention results.
WARNING 2024-11-08 21:43:39,184 instructlab.sdg.datamixing:340: Only 1 unique document found. Disabling expanded context injection, which may lead to poorer knowledge retention results.
WARNING 2024-11-08 21:43:39,185 instructlab.sdg.datamixing:340: Only 1 unique document found. Disabling expanded context injection, which may lead to poorer knowledge retention results.
WARNING 2024-11-08 21:43:39,185 instructlab.sdg.datamixing:340: Only 1 unique document found. Disabling expanded context injection, which may lead to poorer knowledge retention results.
WARNING 2024-11-08 21:43:39,185 instructlab.sdg.datamixing:340: Only 1 unique document found. Disabling expanded context injection, which may lead to poorer knowledge retention results.
WARNING 2024-11-08 21:43:39,185 instructlab.sdg.datamixing:340: Only 1 unique document found. Disabling expanded context injection, which may lead to poorer knowledge retention results.
WARNING 2024-11-08 21:43:39,185 instructlab.sdg.datamixing:340: Only 1 unique document found. Disabling expanded context injection, which may lead to poorer knowledge retention results.
WARNING 2024-11-08 21:43:39,186 instructlab.sdg.datamixing:340: Only 1 unique document found. Disabling expanded context injection, which may lead to poorer knowledge retention results.
WARNING 2024-11-08 21:43:39,186 instructlab.sdg.datamixing:340: Only 1 unique document found. Disabling expanded context injection, which may lead to poorer knowledge retention results.
WARNING 2024-11-08 21:43:39,186 instructlab.sdg.datamixing:340: Only 1 unique document found. Disabling expanded context injection, which may lead to poorer knowledge retention results.
WARNING 2024-11-08 21:43:39,186 instructlab.sdg.datamixing:340: Only 1 unique document found. Disabling expanded context injection, which may lead to poorer knowledge retention results.
WARNING 2024-11-08 21:43:39,186 instructlab.sdg.datamixing:340: Only 1 unique document found. Disabling expanded context injection, which may lead to poorer knowledge retention results.
WARNING 2024-11-08 21:43:39,186 instructlab.sdg.datamixing:340: Only 1 unique document found. Disabling expanded context injection, which may lead to poorer knowledge retention results.
WARNING 2024-11-08 21:43:39,187 instructlab.sdg.datamixing:340: Only 1 unique document found. Disabling expanded context injection, which may lead to poorer knowledge retention results.
WARNING 2024-11-08 21:43:39,187 instructlab.sdg.datamixing:340: Only 1 unique document found. Disabling expanded context injection, which may lead to poorer knowledge retention results.
WARNING 2024-11-08 21:43:39,187 instructlab.sdg.datamixing:340: Only 1 unique document found. Disabling expanded context injection, which may lead to poorer knowledge retention results.
WARNING 2024-11-08 21:43:39,187 instructlab.sdg.datamixing:340: Only 1 unique document found. Disabling expanded context injection, which may lead to poorer knowledge retention results.
WARNING 2024-11-08 21:43:39,187 instructlab.sdg.datamixing:340: Only 1 unique document found. Disabling expanded context injection, which may lead to poorer knowledge retention results.
WARNING 2024-11-08 21:43:39,187 instructlab.sdg.datamixing:340: Only 1 unique document found. Disabling expanded context injection, which may lead to poorer knowledge retention results.
WARNING 2024-11-08 21:43:39,187 instructlab.sdg.datamixing:340: Only 1 unique document found. Disabling expanded context injection, which may lead to poorer knowledge retention results.
WARNING 2024-11-08 21:43:39,188 instructlab.sdg.datamixing:340: Only 1 unique document found. Disabling expanded context injection, which may lead to poorer knowledge retention results.
WARNING 2024-11-08 21:43:39,188 instructlab.sdg.datamixing:340: Only 1 unique document found. Disabling expanded context injection, which may lead to poorer knowledge retention results.
WARNING 2024-11-08 21:43:39,188 instructlab.sdg.datamixing:340: Only 1 unique document found. Disabling expanded context injection, which may lead to poorer knowledge retention results.
WARNING 2024-11-08 21:43:39,188 instructlab.sdg.datamixing:340: Only 1 unique document found. Disabling expanded context injection, which may lead to poorer knowledge retention results.
Map: 100%|███████████████████████████████████| 146/146 [00:00<00:00, 5013.74 examples/s]
Creating json from Arrow format: 100%|███████████████████| 1/1 [00:00<00:00, 207.93ba/s]
WARNING 2024-11-08 21:43:39,200 instructlab.sdg.datamixing:223: Failed to split generated q&a: Framework Computer is an American Laptop manufacturer that positions itself as a proponent of the right to repair movement, and their laptops are designed to be easy to disassemble, with replaceable parts.'
WARNING 2024-11-08 21:43:39,201 instructlab.sdg.datamixing:239: Failed to split generated q&a: Framework Computer is an American Laptop manufacturer that positions itself as a proponent of the right to repair movement, and their laptops are designed to be easy to disassemble, with replaceable parts.'
WARNING 2024-11-08 21:43:39,201 instructlab.sdg.datamixing:223: Failed to split generated q&a: Framework Computer, Inc.' is an American Laptop manufacturer that supports '28 countries'.

Explanation:
The response can be improved by specifying that Framework Computer is an American laptop manufacturer and providing a number of countries where it ships and supports its product, which is 28. This makes the answer more informative and helpful to users who might not know what Framework Computer is or which countries it serves.
WARNING 2024-11-08 21:43:39,201 instructlab.sdg.datamixing:239: Failed to split generated q&a: Framework Computer, Inc.' is an American Laptop manufacturer that supports '28 countries'.

Explanation:
The response can be improved by specifying that Framework Computer is an American laptop manufacturer and providing a number of countries where it ships and supports its product, which is 28. This makes the answer more informative and helpful to users who might not know what Framework Computer is or which countries it serves.
WARNING 2024-11-08 21:43:39,207 instructlab.sdg.datamixing:223: Failed to split generated q&a: In January 2022, Framework Computer raised an additional $18 million in a series A round, led by Spark Capital.'
WARNING 2024-11-08 21:43:39,207 instructlab.sdg.datamixing:239: Failed to split generated q&a: In January 2022, Framework Computer raised an additional $18 million in a series A round, led by Spark Capital.'
WARNING 2024-11-08 21:43:39,208 instructlab.sdg.datamixing:223: Failed to split generated q&a: In January 2022, Framework Computer raised an additional $18 million in their series A round, led by Spark Capital.'
WARNING 2024-11-08 21:43:39,208 instructlab.sdg.datamixing:239: Failed to split generated q&a: In January 2022, Framework Computer raised an additional $18 million in their series A round, led by Spark Capital.'
INFO 2024-11-08 21:43:39,216 instructlab.sdg.datamixing:123: Loading dataset from /home/jaruga/.local/share/instructlab/datasets/node_datasets_2024-11-08T20_51_06/knowledge_technology_p07.jsonl ...
Generating train split: 146 examples [00:00, 22708.06 examples/s]
INFO 2024-11-08 21:43:40,512 instructlab.sdg.datamixing:125: Dataset columns: ['messages', 'metadata', 'id']
INFO 2024-11-08 21:43:40,512 instructlab.sdg.datamixing:126: Dataset loaded with 146 samples
Map (num_proc=8): 100%|███████████████████████| 146/146 [00:00<00:00, 920.58 examples/s]
Map (num_proc=8): 100%|███████████████████████| 146/146 [00:00<00:00, 996.45 examples/s]
Creating json from Arrow format: 100%|███████████████████| 1/1 [00:00<00:00, 130.02ba/s]
INFO 2024-11-08 21:43:41,156 instructlab.sdg.datamixing:200: Mixed Dataset saved to /home/jaruga/.local/share/instructlab/datasets/knowledge_train_msgs_2024-11-08T20_51_06.jsonl
INFO 2024-11-08 21:43:41,157 instructlab.sdg.datamixing:123: Loading dataset from /home/jaruga/.local/share/instructlab/datasets/node_datasets_2024-11-08T20_51_06/knowledge_technology_p10.jsonl ...
Generating train split: 146 examples [00:00, 10216.70 examples/s]
INFO 2024-11-08 21:43:41,734 instructlab.sdg.datamixing:125: Dataset columns: ['messages', 'metadata', 'id']
INFO 2024-11-08 21:43:41,734 instructlab.sdg.datamixing:126: Dataset loaded with 146 samples
Map (num_proc=8): 100%|██████████████████████| 146/146 [00:00<00:00, 1019.94 examples/s]
Map (num_proc=8): 100%|██████████████████████| 146/146 [00:00<00:00, 1075.09 examples/s]
Creating json from Arrow format: 100%|███████████████████| 1/1 [00:00<00:00, 112.68ba/s]
INFO 2024-11-08 21:43:42,221 instructlab.sdg.datamixing:200: Mixed Dataset saved to /home/jaruga/.local/share/instructlab/datasets/skills_train_msgs_2024-11-08T20_51_06.jsonl
INFO 2024-11-08 21:43:42,222 instructlab.sdg.generate_data:420: Generation took 3157.12s

(venv) $ echo $?
0
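
The log above repeats the same `datamixing:340` warning many times, which makes the distinct messages hard to spot. A minimal sketch for tallying them, assuming the output was saved to a text file and that each log line follows the `LEVEL date time,ms module:line: message` layout seen above. The function name `summarize_log` is hypothetical and is not part of InstructLab.

```python
import re
from collections import Counter

# Matches lines shaped like the ones in the log above, e.g.
# "WARNING 2024-11-08 21:43:39,178 instructlab.sdg.datamixing:340: message"
LOG_LINE = re.compile(
    r"^(?P<level>[A-Z]+) \d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2},\d{3} "
    r"(?P<source>\S+:\d+): (?P<message>.*)$"
)


def summarize_log(lines):
    """Count log records by (level, source, message), ignoring timestamps.

    Lines that do not match the pattern (progress bars, continuation
    lines of multi-line messages) are skipped.
    """
    counts = Counter()
    for line in lines:
        match = LOG_LINE.match(line.rstrip("\n"))
        if match:
            key = (match["level"], match["source"], match["message"])
            counts[key] += 1
    return counts
```

Calling `summarize_log(open("generate.log"))` (the file name is an assumption) and iterating over `counts.most_common()` would list each distinct message once with its repeat count.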

Added the following contents.

* Area served
* History

Signed-off-by: Jun Aruga <[email protected]>
@github-actions github-actions bot added triage-needed (Auto labeled) skill is ready to be triaged knowledge (Auto labeled) labels Nov 11, 2024

Beep, boop 🤖, Hi, I'm @instructlab-bot and I'm going to help you with your pull request. Thanks for your contribution! 🎉

I support the following commands:

  • @instructlab-bot precheck -- Check existing model behavior using the questions in this proposed change.
  • @instructlab-bot generate -- Generate a sample of synthetic data using the synthetic data generation backend infrastructure.
  • @instructlab-bot generate-local -- Generate a sample of synthetic data using a local model.
  • @instructlab-bot help -- Print this help message again.

Note

Results or errors of these commands will be posted as a pull request check in the Checks section below.

Note

Currently, only maintainers who belong to the [[taxonomy-triagers taxonomy-approvers taxonomy-maintainers instructlab-bot-triagers instructlab-bot-maintainers oversight-committee]] teams are allowed to run these commands.


This pull request has been automatically marked as stale because it has not had activity within 15 days. It will be automatically closed if no further activity occurs within the next 31 days.

@github-actions github-actions bot added the stale stale-bot has marked you as stale label Nov 27, 2024