Invalid value for 'content': expected a string, got null. #1421

flomrz · 2024-11-18T16:41:05Z

Do you need to file an issue?

I have searched the existing issues and this bug is not already filed.
My model is hosted on OpenAI or Azure. If not, please look at the "model providers" issue and don't file a new one here.
I believe this is a legitimate bug, not just a question. If this is a question, please use the Discussions area.

Describe the issue

Describe the bug
When I try to index txt files i get this error after indexing Several files. After a few Hours of indexing i get this error : openai.BadRequestError: Error code: 400 - {'error': {'message': "Invalid value for 'content': expected a string, got null.", 'type': 'invalid_request_error', 'param': 'messages.[1].content', 'code': None}}
To Reproduce
I use gpt4o-mini and deployed the Accelerator as told
My indexing files are not empty and are in UTF-8

Steps to reproduce

I just used the accelerator and tried to index txt files

GraphRAG Config Used

# Paste your config here

Logs and screenshots

{
'type': 'on_workflow_start',
'data': 'Index: gut_index_pdf -- Workflow (1/16): create_base_text_units started.',
'details': {
'workflow_name': 'create_base_text_units',
'index_name': 'gut_index_pdf',
},
}
{
'type': 'on_workflow_end',
'data': 'Index: gut_index_pdf -- Workflow (1/16): create_base_text_units complete.',
'details': {
'workflow_name': 'create_base_text_units',
'index_name': 'gut_index_pdf',
},
}
{
'type': 'on_workflow_start',
'data': 'Index: gut_index_pdf -- Workflow (2/16): create_base_extracted_entities started.',
'details': {
'workflow_name': 'create_base_extracted_entities',
'index_name': 'gut_index_pdf',
},
}
{
'type': 'error',
'data': 'Error Invoking LLM',
'cause': (
'Error code: 400 - {'error': {'message': "Invalid value for 'content': expected a string, got null.", 'type''
": 'invalid_request_error', 'param': 'messages.[1].content', 'code': None}}"
),
'stack': (
'Traceback (most recent call last):\n'
' File "/usr/local/lib/python3.10/site-packages/graphrag/llm/base/base_llm.py", line 53, in _invoke\n'
' output = await self._execute_llm(input, **kwargs)\n'
' File "/usr/local/lib/python3.10/site-packages/graphrag/llm/openai/openai_chat_llm.py", line 53, in _execu'
'te_llm\n'
' completion = await self.client.chat.completions.create(\n'
' File "/usr/local/lib/python3.10/site-packages/openai/resources/chat/completions.py", line 1633, in create'
'\n'
' return await self._post(\n'
' File "/usr/local/lib/python3.10/site-packages/openai/_base_client.py", line 1838, in post\n'
' return await self.request(cast_to, opts, stream=stream, stream_cls=stream_cls)\n'
' File "/usr/local/lib/python3.10/site-packages/openai/_base_client.py", line 1532, in request\n'
' return await self._request(\n'
' File "/usr/local/lib/python3.10/site-packages/openai/_base_client.py", line 1633, in _request\n'
' raise self._make_status_error_from_response(err.response) from None\n'
'openai.BadRequestError: Error code: 400 - {'error': {'message': "Invalid value for 'content': expected a st'
'ring, got null.", 'type': 'invalid_request_error', 'param': 'messages.[1].content', 'code': None}}\n'
),
'details': {
'input': (
'MANY entities and relationships were missed in the last extraction. Remember to ONLY emit entities that'
' match any of the previously extracted types. Add them below using the same format:\n'
),
},
}
{
'type': 'error',
'data': 'Entity Extraction Error',
'cause': (
'Error code: 400 - {'error': {'message': "Invalid value for 'content': expected a string, got null.", 'type''
": 'invalid_request_error', 'param': 'messages.[1].content', 'code': None}}"
),
'stack': (
'Traceback (most recent call last):\n'
' File "/usr/local/lib/python3.10/site-packages/graphrag/index/graph/extractors/graph/graph_extractor.py", '
'line 122, in call\n'
' result = await self._process_document(text, prompt_variables)\n'
' File "/usr/local/lib/python3.10/site-packages/graphrag/index/graph/extractors/graph/graph_extractor.py", '
'line 161, in _process_document\n'
' response = await self._llm(\n'
' File "/usr/local/lib/python3.10/site-packages/graphrag/llm/openai/json_parsing_llm.py", line 34, in cal'
'l\n'
' result = await self._delegate(input, **kwargs)\n'
' File "/usr/local/lib/python3.10/site-packages/graphrag/llm/openai/openai_token_replacing_llm.py", line 37'
', in call\n'
' return await self._delegate(input, **kwargs)\n'
' File "/usr/local/lib/python3.10/site-packages/graphrag/llm/openai/openai_history_tracking_llm.py", line 3'
'3, in call\n'
' output = await self._delegate(input, **kwargs)\n'
' File "/usr/local/lib/python3.10/site-packages/graphrag/llm/base/caching_llm.py", line 96, in call\n'
' result = await self._delegate(input, **kwargs)\n'
' File "/usr/local/lib/python3.10/site-packages/graphrag/llm/base/rate_limiting_llm.py", line 177, in cal'
'l\n'
' result, start = await execute_with_retry()\n'
' File "/usr/local/lib/python3.10/site-packages/graphrag/llm/base/rate_limiting_llm.py", line 159, in execu'
'te_with_retry\n'
' async for attempt in retryer:\n'
' File "/usr/local/lib/python3.10/site-packages/tenacity/asyncio/init.py", line 166, in anext\n'
' do = await self.iter(retry_state=self._retry_state)\n'
' File "/usr/local/lib/python3.10/site-packages/tenacity/asyncio/init.py", line 153, in iter\n'
' result = await action(retry_state)\n'
' File "/usr/local/lib/python3.10/site-packages/tenacity/_utils.py", line 99, in inner\n'
' return call(*args, **kwargs)\n'
' File "/usr/local/lib/python3.10/site-packages/tenacity/init.py", line 398, in \n'
' self._add_action_func(lambda rs: rs.outcome.result())\n'
' File "/usr/local/lib/python3.10/concurrent/futures/_base.py", line 451, in result\n'
' return self.__get_result()\n'
' File "/usr/local/lib/python3.10/concurrent/futures/_base.py", line 403, in __get_result\n'
' raise self._exception\n'
' File "/usr/local/lib/python3.10/site-packages/graphrag/llm/base/rate_limiting_llm.py", line 165, in execu'
'te_with_retry\n'
' return await do_attempt(), start\n'
' File "/usr/local/lib/python3.10/site-packages/graphrag/llm/base/rate_limiting_llm.py", line 147, in do_at'
'tempt\n'
' return await self._delegate(input, **kwargs)\n'
' File "/usr/local/lib/python3.10/site-packages/graphrag/llm/base/base_llm.py", line 49, in call\n'
' return await self._invoke(input, **kwargs)\n'
' File "/usr/local/lib/python3.10/site-packages/graphrag/llm/base/base_llm.py", line 53, in _invoke\n'
' output = await self._execute_llm(input, **kwargs)\n'
' File "/usr/local/lib/python3.10/site-packages/graphrag/llm/openai/openai_chat_llm.py", line 53, in _execu'
'te_llm\n'
' completion = await self.client.chat.completions.create(\n'
' File "/usr/local/lib/python3.10/site-packages/openai/resources/chat/completions.py", line 1633, in create'
'\n'
' return await self._post(\n'
' File "/usr/local/lib/python3.10/site-packages/openai/_base_client.py", line 1838, in post\n'
' return await self.request(cast_to, opts, stream=stream, stream_cls=stream_cls)\n'
' File "/usr/local/lib/python3.10/site-packages/openai/_base_client.py", line 1532, in request\n'
' return await self._request(\n'
' File "/usr/local/lib/python3.10/site-packages/openai/_base_client.py", line 1633, in _request\n'
' raise self._make_status_error_from_response(err.response) from None\n'
'openai.BadRequestError: Error code: 400 - {'error': {'message': "Invalid value for 'content': expected a st'
'ring, got null.", 'type': 'invalid_request_error', 'param': 'messages.[1].content', 'code': None}}\n'
),
'details': {
'doc_index': 0,
'text': (
'2 h p.i., and phagocytosis/intracellular killing by \r\n'
'kidney-infiltrating neutrophils were measured by FACS. The streptavidin-AF633+dTomato+ \r\n'
'C. albicans emit two fluorescence signals: dTomato+AF633+ is emitted by live C. albicans, \r\n'
'while dead C. albicans emits only AF633 fluorescence signal (dTomato−AF633+).\r\n'
'In vitro C. albicans killing assay— BM neutrophils (5 × 104/well) were seeded in a \r\n'
'flat bottom 96-well plate. After 30 min, 104 CFU of C. albicans were added to the wells. \r\n'
'The number of live C. albicans was assessed by plating serial dilutions on YPD agar at \r\n'
'3 h post-incubation. The percent killing was expressed as [1 − (CFU of C. albicans in \r\n'
'the presence of neutrophils/CFU of C. albicans cells in the absence of neutrophils)] × Li et al. Page '
'14\r\n'
'Cell Host Microbe . Author manuscript; available in PMC 2023 April 13.\r\n'
'Author Manuscript Author Manuscript Author Manuscript Author Manuscript\r\n'
'100%. Fold change C. albicans killing was expressed as percent killing in experimental \r\n'
'group/percent killing in control group.\r\n'
'NETosis assay— BM neutrophils were incubated with C. albicans for 3 h. The cells were \r\n'
'fixed (4% PFA) and permeabilized and NET formation was measured by staining with Sytox \r\n'
'Orange and an anti-neutrophil elastase polyclonal antibody. The secondary antibody used \r\n'
'was goat anti-rabbit Cy5. The staining was visualized using an EVOS FL Auto microscope \r\n'
'(Life Technologies).\r\n'
'Giemsa Staining of blood smear— Peripheral blood smears of PMNWT and \r\n'
'PMNΔGLUT1 mice were stained with Giemsa stain and microscopically evaluated for \r\n'
'the morphology of neutrophils. The staining was visualized using an EVOS FL Auto \r\n'
'microscope.\r\n'
'Transwell migration assay— BM neutrophils were subjected to transwell migration \r\n'
'assay (inserts with 3 μm pore size) in the presence or absence of recombinant murine \r\n'
'CXCL2 (100 ng/ml). Ninety min post-incubation, the numbers of cells in the lower and \r\n'
'upper chambers were quantified by flow cytometry analysis and expressed as Chemotactic \r\n'
'Index (Number of cells in the lower chamber/Number of cells in the upper chamber).\r\n'
'Granulopoiesis measurement— BM cells from femur and tibia were stained with anti-\r\n'
'Ly6G, anti-CD34, anti-CD3, anti-CD19, anti-B220 and anti-C-kit antibodies and analyzed \r\n'
'by flow cytometry analysis.\r\n'
'Apoptosis detection— Mice were infected with 105 CFU of C. albicans via the lateral \r\n'
'tail vein. After 24 hours, kidney cell suspensions were prepared and early and late \r\n'
'apoptotic neutrophils were detected using Annexin V Apoptosis Detection Kit, according \r\n'
'to manufacturer instructions, followed by flow cytometry analysis.\r\n'
'ROS measurement— Mice were infected with 105 CFU of C. albicans via the lateral \r\n'
'tail vein. After 24 hours, kidney cell suspensions were prepared and resuspended with \r\n'
'RPMI complete medium. CellROX® Deep Red reagent was added to the cell suspensions \r\n'
'and incubated at 37°C for 30 min. The ROS production was measured by flow cytometry \r\n'
'analysis.\r\n'
'Western blot analysis— Neutrophils were lysed by 1×NP-40 lysis buffer supplemented \r\n'
'with protease inhibitor cocktail. Lysates were separated by SDS-PAGE and transferred \r\n'
'to polyvinylidene difluoride membranes. After incubation with primary and secondary \r\n'
'antibodies, protein bands were detected using an enhanced chemiluminescence detection \r\n'
'system and developed with a FluorChem E imager (ProteinSimple).\r\n'
'RNA extraction and qPCR— RNA was extracted from tissues or neutrophils using \r\n'
'RNeasy kits. Complementary DNA was synthesized by SuperScript III First Strand Kits. \r\n'
'Quantitative real-time PCR (qPCR) was performed with the PerfeCTa SYBR Green FastMix \r\n'
'and analyzed on an ABI 7300 real-time instrument. Primers were obtained from QuantiTect \r\n'
'Primer Assays. The expression of each gene was normalized to that of Gapdh .Li et al. Page 15\r\n'
'Cell Host Microbe . Author manuscript; available in PMC 2023 April 13.\r\n'
'Author Manuscript Author Manuscript Author Manuscript Author Manuscript\r\n'
'Seahorse metabolic assay— BM neutrophils from PMNWT and PMNΔGLUT1 mice were \r\n'
'isolated and plated on Cell-Tak coated Seahorse culture plates (5 × 105 cells/well) in DMEM \r\n'
'without glucose. After stimulated with/without curdlan (Cur) (10 μg/ml) at 37°C for 3 h, the \r\n'
'cells were analyzed using a Seahorse XFe96 Analyzer (Agilent). EACR (Basal extracellular \r\n'
'acidification rate) was detected in the presence of glucose (25mM), oligomycin (2 mM), \r\n'
'2-deoxyglucose (100 mM) to obtain maximal and control EACR values.\r\n'
'Untargeted high-resolution LC-HRMS— Neutrophils from PMNWT and PMN△GLUT1 \r\n'
'mice were either stimulated with curdlan (10 μg/ml) for 3 h (8 replications per'
),
},
}

additional info
as i said im using gpt4o-mini and the "GRAPHRAG_API_VERSION": "2024-08-01-preview"

all txt files have content in them. I've used auto generated prompts.

Additional Information

I index my files and it first seems to wokr, it takes some time on step 2 but i can see in the metriks of my AOAI that requests are being made etc. i can also see files in the entity_extraction chache. After a while the requests top and i geht this error and just stops, the job doesnt restart or anything it just stands still at step 2

as i said im using gpt4o-mini and the "GRAPHRAG_API_VERSION": "2024-08-01-preview"

all txt files have content in them. I've used auto generated prompts. Those are also not empty

flomrz added the triage Default label assignment, indicates new issue needs reviewed by a maintainer label Nov 18, 2024

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Invalid value for 'content': expected a string, got null. #1421

Invalid value for 'content': expected a string, got null. #1421

flomrz commented Nov 18, 2024

Invalid value for 'content': expected a string, got null. #1421

Invalid value for 'content': expected a string, got null. #1421

Comments

flomrz commented Nov 18, 2024

Do you need to file an issue?

Describe the issue

Steps to reproduce

GraphRAG Config Used

Logs and screenshots

Additional Information