Would like to free Exp_pool earlier. #1413

Draft
linas wants to merge 1 commit into master

Conversation

@linas (Member) commented Feb 5, 2023

It can get very large during MST parsing, and just takes up RAM that extract-linkages could use. So I'd like to free it, but I cannot, because parsing with nulls requires valid data there.
@linas (Member, Author) commented Feb 5, 2023

@ampli any comments or ideas w.r.t. this pull req?

@ampli (Member) commented Feb 5, 2023

In generation mode we don't use parsing with nulls. The same goes for batch mode. However, batch mode is now an option of link-parser and not of the library. Making it again a library option would allow this memory-pool freeing, and would also make batch processing faster, because you can stop handling a sentence after you get the first valid linkage.
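
A minimal sketch of that early stop, against the public C API (the freeing comment marks where a hypothetical library-side batch option would act; none of this is existing behavior):

```c
#include <stdbool.h>
#include <link-grammar/link-includes.h>

/* Sketch: batch-style handling. Accept the sentence as soon as at
 * least one valid linkage exists, and skip the null/panic reparse.
 * A hypothetical library-side batch option could free Exp_pool and
 * the other per-sentence pools at the marked point. */
static bool batch_accept(Sentence sent, Parse_Options opts)
{
    sentence_parse(sent, opts);
    if (sentence_num_valid_linkages(sent) > 0)
    {
        /* Passed: with a batch option, pools could be freed here. */
        return true;
    }
    return false;   /* failed: the caller may reparse with nulls */
}
```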

A more drastic change would be to disallow 2-step parsing altogether (but in such a case I would keep it available as a "test"). In that case we know there will not be a parse with nulls after an attempt to parse with no nulls.

@linas (Member, Author) commented Feb 5, 2023

Yeah. I can add a flag.

Disallowing two-step parsing is not that "drastic" ... the tokenization and dictionary lookup happen pretty quickly. We could just repeat them, for each null count.

I don't know how to think about the trade-off between the "elegance" of not having to do tokenization several times, vs. the "efficiency" of discarding large chunks of RAM earlier. I'm actually surprised that the code was smart enough to know that it did not need to do the dict-lookup again. Link-grammar code is complicated.

@ampli (Member) commented Feb 5, 2023

> Disallowing two-step parsing is not that "drastic" ... the tokenization and dictionary lookup happen pretty quickly. We could just repeat them, for each null count.

By "drastic" I ment to disallow it all together, not to actually support it by doing tokenization/expression pruning again.

The traditional way to do the parsing, and what link-parser uses by default just now, is what I named in the code comments "2-step parsing". I guess that most users who wrote their own code in C or in any language with bindings also parse this way.

So disallowing 2-step parsing would invalidate most user-written parsers. On the other hand, re-tokenizing and doing expression-pruning again would currently waste several percent of CPU (and more if non-contiguous-morphology tokenizing is implemented, and more still if a better expression-pruning is implemented, one that naturally takes more CPU but is expected to save more CPU in the power pruning). So I don't recommend forcing tokenization again for additional parses of a sentence.
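
For reference, the traditional 2-step sequence, as seen through the public C API, looks roughly like this (a minimal sketch; error handling omitted):

```c
#include <link-grammar/link-includes.h>

int main(void)
{
    Dictionary dict = dictionary_create_lang("en");
    Parse_Options opts = parse_options_create();
    Sentence sent = sentence_create("this is a test", dict);

    /* Step 1: try a complete parse, with no null links. */
    parse_options_set_min_null_count(opts, 0);
    parse_options_set_max_null_count(opts, 0);
    int num_linkages = sentence_parse(sent, opts);

    /* Step 2: no complete parse; retry, allowing null links. */
    if (num_linkages == 0)
    {
        parse_options_set_min_null_count(opts, 1);
        parse_options_set_max_null_count(opts, sentence_length(sent));
        num_linkages = sentence_parse(sent, opts);
    }

    sentence_delete(sent);
    parse_options_delete(opts);
    dictionary_delete(dict);
    return 0;
}
```

It is between the two sentence_parse() calls that the library cannot know whether the second call will come; that uncertainty is what blocks the early freeing.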

But I just now got an idea: we can add a parse option to indicate that we are not going to do an additional parse of this sentence, and then the memory pools can be safely released. (Optionally, we can add support so that if the user issues a parse with nulls anyway, tokenization (and expression pruning) will be done again, as you wrote above. But I don't think this is needed if the user has explicitly set this parse option, since an additional parse is then not expected.)
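
Such an option might look like this (purely a sketch: the single_pass field and setter are hypothetical, named after the existing parse_options_set_* convention, and pool_delete() stands in for the internal pool destructor):

```c
/* Hypothetical public setter, following the parse_options_set_*
 * naming convention. Not part of the current API. */
void parse_options_set_single_pass(Parse_Options opts, bool single_pass)
{
    opts->single_pass = single_pass;   /* hypothetical field */
}

/* Hypothetical hook inside the library, after counting finishes:
 * if no further parse of this sentence is expected, the expression
 * pool (and the other pool-reused pools) can be freed immediately. */
if (opts->single_pass)
{
    pool_delete(sent->Exp_pool);
    sent->Exp_pool = NULL;
}
```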

@ampli (Member) commented Feb 5, 2023

the "efficiency" of discarding large chunks of RAM earlier.

How much RAM does it save, compared to Table_tracon (and table_lrcnt, including its additional pools) and also to the parse-set memory? Does the library then work faster, or can you then do parses for which you currently don't have enough memory?

I asked this to understand why it might be important to release this memory earlier. (I guess that not releasing the memory, and just reusing it, would be the faster option.)

@ampli (Member) commented Feb 6, 2023

> that extract-linkages could use

Note that there is another way to save a lot of RAM:
After the parse, you can compact the Table_tracon table to include only the non-zero entries, and then free the original one. I once tested this, with the thought that a smaller Table_tracon might fit the CPU cache better and thus speed up mk_parse_set(). But there was no saving (including in generation mode). In addition, we know that freeing and re-allocating this table may cost us about 20% CPU (on Linux). For English, this table has relatively very few non-zero count entries, but I don't know whether that is so with MST too.

To trace the number of non-zero entries, configure with -DDEBUG_TABLE_STAT.
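
Such a compaction might look roughly like this (a generic sketch; the entry type and field names are hypothetical stand-ins for the real Table_tracon layout):

```c
#include <stdlib.h>
#include <stdint.h>

/* Hypothetical stand-in for a Table_tracon entry. */
typedef struct { uint64_t tracon_id; int32_t count; } Entry;

/* Copy only the non-zero-count entries into a tightly-sized table,
 * so the original (mostly-empty) table can be freed. */
static Entry *compact_table(const Entry *table, size_t old_sz, size_t *new_sz)
{
    size_t n = 0;
    for (size_t i = 0; i < old_sz; i++)
        if (table[i].count != 0) n++;

    Entry *compacted = malloc(n * sizeof(Entry));
    size_t j = 0;
    for (size_t i = 0; i < old_sz; i++)
        if (table[i].count != 0) compacted[j++] = table[i];

    *new_sz = n;
    return compacted;   /* the caller frees the original table */
}
```

A real version would also need to re-key lookups (e.g., binary search over the sorted survivors), since the original hash positions are gone after compaction.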

@linas (Member, Author) commented Feb 7, 2023

> How much RAM

The size of Exp_pool is discussed in #1402 (comment), and I reproduce the graph below.

[Graph: djm-dj-exp]

For English, the number of Elts stays under 100K, more or less. For MST, it gets up to about 3M. For now, I can live with that; it's not urgent. Also, there are other, non-link-grammar dependencies that alter the size of the MST dictionary. It might get smaller in the future, or it might get bigger. I'm working with a beta-test dataset right now.

> How much RAM compared to Table_tracon

It is about the same size, maybe slightly larger than Table_tracon. I have no idea how big ctxt->table_lrcnt[0].sz is, but I do monitor the size of wordvec_pool. For English, the wordvec_pool is about the same size as the tracon table. For MST, it is smaller(!), probably because the MST sentences are (currently) short.

> I asked this to understand why it might be important to release this memory earlier. I guess that not releasing memory (and just reusing it) would be the faster option.

I guess this might be an OK idea. The only tables that get outrageously large are the mlc_pool and mld_pool, and the Parse_choice pool. But, after the recent changes, these are now "under control" and behaving acceptably. When running, I see RAM use pop up to 30 GB for half a minute or more, then drop back down to under 20 GB, and bounce up and down like that. This is due to the pools being malloced and freed. And I'm OK with this, I guess. Right now, it's not a problem.

> you can compact the Table_tracon

The size of pool_size(sent->Table_tracon_pool) is acceptable. It is shown in #1402 (comment). Trying to squeeze it does not seem to be important.

linas marked this pull request as draft on February 7, 2023
@ampli (Member) commented Feb 7, 2023

ctxt->table_lrcnt[0].sz is the number of LHS tracons (and ctxt->table_lrcnt[1].sz the number of RHS tracons).
An upper bound on pool_size(sent->Table_tracon_pool) is
sentence_length * (ctxt->table_lrcnt[0].sz + ctxt->table_lrcnt[1].sz),
since for each tracon at most abs(c->farthest_word - c->nearest_word) elements are allocated (and for many, I think for tracons that are only shallow, no elements are allocated at all).

@ampli (Member) commented Feb 7, 2023

> the mlc and mld_pool

In principle, it is possible to add a test argument to disable them. They speed up the failure batch by ~20%, but I don't have any idea how much, if anything, they contribute to the speed of MST parses of short sentences.

I have an active branch in which I added #define's to control their use (in addition to controlling the use of Table_tracon, table_lrcnt, and the word-skip vector). In that branch, I'm adding a tighter "alternative mismatch" rejection while counting, to reduce the number of unsane linkages in the "amy" language (and in general). I used these #define's for debugging, to help find bugs in the added "alternative mismatch" implementation (I found one bug by disabling the use of Table_tracon). These #define's seem useful anyway, and they self-document the sections of code that implement the various caches, so I intend to include them in a PR. If the mlc_pool and mld_pool can grow very big, it will then be easy to make a run without them, or to add a test argument to disable them.

@linas (Member, Author) commented Feb 7, 2023

I graphed pool_num_elements_issued(ctxt->mchxt->mlc_pool) in two different ways:

[Graphs: djm-dj-mlc, djm-exp-mlc]

As you can see in either graph, it can grow to about 30 million entries, so it remains under a gigabyte, which is acceptable. For me, anything under 10 gigabytes per thread is acceptable; after that, the troubles compound.

> easy to make a run

I wish. All my runs are fiendishly complex.

> #define

If the MST runs become smaller or faster without these pools, then having the #defines in the dict is where I'd want them.

Off-topic: for me, having some of the Parse_Opts be #defines in the dict would solve some of my problems. This includes !limit and !timeout ... and maybe also islands_ok and repeatable_rand, and the recently added disjunct_limit, perform_pp_prune and all_short, and maybe even the min/max null counts. Right now, these are hard-coded in my C++ code, and having them configurable in the dicts would be a great way of avoiding some of the balancing-act issues I have.
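
Something like the following could work, reusing the dict's existing #define mechanism (the option name below is made up, and linkgrammar_get_dict_define() should be checked against the current link-includes.h; treat all of it as an assumption):

```c
#include <stdlib.h>
#include <link-grammar/link-includes.h>

/* Dict side (hypothetical name), using the dict #define syntax:
 *     #define default-parse-timeout 30;
 * Library or application side: read the define and apply it as a
 * default, keeping the built-in default when the dict says nothing. */
static void apply_dict_defaults(Dictionary dict, Parse_Options opts)
{
    const char *v = linkgrammar_get_dict_define(dict, "default-parse-timeout");
    if (v != NULL)
        parse_options_set_max_parse_time(opts, atoi(v));
}
```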

@ampli (Member) commented Feb 7, 2023

> Off-topic: for me, having some of the Parse_Opts be #defines in the dict would solve some of my problems. This includes !limit and !timeout ... and maybe also islands_ok and repeatable_rand, and the recently added disjunct_limit, perform_pp_prune and all_short, and maybe even the min/max null counts. Right now, these are hard-coded in my C++ code, and having them configurable in the dicts would be a great way of avoiding some of the balancing-act issues I have.

It seems easy to implement. I can do that.

[Review thread on this hunk:]

```c
	if (IS_GENERATION(sent->dict))
		create_wildcard_word_disjunct_list(sent, opts);

	build_sentence_disjuncts(sent, opts->disjunct_cost, opts);

#ifdef NEED_THESE_FOR_NULL_AND_PANIC_PARSING
```
@ampli (Member) commented:

We can add a parse option to indicate that a second pass is not expected. In that case, we can not only free the Exp_pool but also immediately delete all the pools that are now pool-reused.

If a second pass happens anyway (disregarding the value of the said parse option) and there are no expressions, we can just return an error.

@linas (Member, Author) commented:

There can only be a second pass if either

```
 null         Allow null links                                  1 (On)
 panic        Use of "panic mode"                               1 (On)
```

If both are off, then there's no second pass, right?

@ampli (Member) commented:

Yes. But this is a link-parser convention only; the library doesn't know about it. For example, if the library gets a sentence_parse() request with both min_null_count==0 and max_null_count==0, it doesn't know whether a subsequent call with, e.g., min_null_count==1 and max_null_count==sent_length will follow. So maybe we can add a free_memory or even a batch_mode parse option.

@linas (Member, Author) commented Jun 3, 2024:

Ah, yes. I forgot what this was about. Perhaps a new bool single_pass in struct Parse_Options_s?

@ampli (Member) commented Jun 4, 2024

I have not yet commented on some things you mentioned:

Regarding the possible earlier release of mlc_pool: mk_parse_set() uses it, so releasing it would free working memory at the expense of significantly more CPU time. However, this may be acceptable, since the CPU time of mk_parse_set() is small relative to the total. The savings here could be 20M x 16B ≈ 320 MB. BTW, I'm now working on a change that, as a side effect, should drastically reduce the size of mlc_pool.

Regarding mld_pool: it was removed in PR #1460.

But I didn't understand the possible reason for freeing the Exp_pool earlier. Since later stages don't use its memory, it would just get moved to swap space (instead of occupying working RAM), which should not cause a big issue. Its removal from working memory could be made faster if MADV_PAGEOUT were used; to that end, we could use memory-pool block sizes that are exact multiples of PAGESIZE, to make the page-out more efficient.
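
A sketch of that hint (Linux-specific: MADV_PAGEOUT needs kernel 5.4+; it assumes pool blocks are page-aligned and span whole pages, which is exactly what the PAGESIZE suggestion above would guarantee):

```c
#include <stdint.h>
#include <sys/mman.h>
#include <unistd.h>

/* Ask the kernel to reclaim a pool block's pages right away instead
 * of waiting for memory pressure. Only whole pages are eligible, so
 * the block must start on a page boundary and span whole pages. */
static void pageout_block(void *block, size_t block_sz)
{
    long page = sysconf(_SC_PAGESIZE);
    if (((uintptr_t)block % (uintptr_t)page == 0) && (block_sz % (size_t)page == 0))
        (void)madvise(block, block_sz, MADV_PAGEOUT);
}
```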

@linas (Member, Author) commented Jun 4, 2024

I want to keep this particular pull req open, but otherwise put it into "cold storage". The details of how I generate data are changing, how I use that data is changing, and so the details of RAM and CPU usage will change as well.

Frankly, my head is all clogged up with 1001 other issues right now, and while it might be nice to think the above problem just "went away", I can't really revisit it right now.
