-
Notifications
You must be signed in to change notification settings - Fork 48
Added a framework to be able to diff shitty/ios config #277
base: develop
Are you sure you want to change the base?
Conversation
You can ignore my comment, the double git confused me |
Just tested briefly, it looks nice. It will need some more ordering logic though, for example with access-lists. I sent in #278, which Travis seems happy with but it doesn't look like the tests are run. It fails locally for me due to ordering issues. It could possibly work sometimes but it will be random. Perhaps an ordered dict would do it. |
|
||
It is important this is up to date or the diff might fail to detect changes properly. | ||
""" | ||
EXACT_MATCHES = [ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I would have two questions:
- What happens if a command that can have multiple instances does not appear in this list?
- How long is this list going to be?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
- It might fail to generate a proper diff.
- Only god and John Chamber know.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, I am worried that item 2 might be huge and that might cause this diff to be unreliable and quite a bit of work to maintain.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Well, maintenance is easy, we just have to add the missing statements as we find them. Lots could actually be autodiscovered as the identifier is going to be an IP address or a number so you can regex the command and try to figure out if there is an identifier. For example, logging syslog 1.1.1.1
, ntp server 1.1.1.1
, ip address 1.1.1.1
. Not sure if it's worth adding the extra complexity/computation as commands are finite, at some point you will have a complete list.
Also, this is best effort. People is free to not trust our diff and go and yell their vendors until they provide a reliable mechanism.
Out of curiosity: what happens if you load a configuration that removes an entire (sub-)block?
Loading
Would that be the case? |
parsed[last] = parse_indented_config(config, leading_spaces, current_indent, True) | ||
elif leading_spaces < current_indent: | ||
config.insert(0, line) | ||
break |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
break
are hidden traps in general, it's good practice to avoid.
I would suggest having a simple for
loop going through the lines.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why are break hidden traps? Is there a PEP or something I can look at? I don’t think you can build this with a for loop but if you have ideas I am all ears.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, what is the argument here a break
is very common pattern. Why should they be avoided?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Using / avoiding break tends to be very subjective (everyone has a different opinion: https://stackoverflow.com/questions/3922599/is-it-a-bad-practice-to-use-break-in-a-for-loop, https://ubuntuforums.org/showthread.php?t=810001), so this is more like a suggestion.
I tend to avoid, in particular when the length is deterministic, like here (i.e., you are not waiting for some input you don't know how much it is going to take, but the config
has a fixed size).
In this case, I think it is even simpler, as you can replace the break
with return parsed
.
I've dug a bit and finally found what I was looking for: http://wiki.c2.com/?IsBreakStatementArchaic, although speaking mainly about the switch-case, it covers also the usage inside the loops; it's a good article, well explained.
Some snippets:
It should be noted that break originally existed because C was originally (and for the most part still is) essentially a mid-high-level portable assembler.
I don't recommend that it be removed from the language, only that a better alternative is also offered. In other words, extend the language with a better construct. The old one is simply deprecated, but supported.
And I think there are very good alternatives in Python.
Anyway, the main reasoning on why better return
in the favor of break
:
Use a return statement, which causes execution to leave the current subroutine and resume at the point in the code immediately after where the subroutine was called (return address).
Which is a bit more optimized approach rather than a forced interrupt.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
particular when the length is deterministic, like here
That's where you are wrong. The length is not deterministic. I don't know where the block ends until I find the end of the block. This is a parser, I am consuming tokens and building an object as I make sense of the unstructured text. I have no prior knowledge or clue of how the original text look like until I look at it and I certainly can't predict how the resulting object will look like.
In this case, I think it is even simpler, as you can replace the break with return parsed.
Consider this case:
while True:
if not data:
break
token = data.pop(0)
# things happen
if condition:
break
return a
What you are proposing is to do this instead:
while True:
if not data:
return a
token = data.pop(0)
# things happen
if condition:
return a
Which is IMHO duplicating code because I have the same return statement twice and if I now have to process a
before returning I will have to duplicate more code:
while True:
if not data:
# code to process a
return a
token = data.pop(0)
# things happen
if condition:
# code to process a
return a
With a break statement the code is as clear and I avoid duplicating code:
while True:
if not data:
break
token = data.pop(0)
# things happen
if condition:
break
# code to process a
return a
This is an extremely common pattern in parsers and search algorithms:
If you look at the stdlib and search for parse
you will see most parsers replicate that pattern.
And I think there are very good alternatives in Python.
For example?
Which is a bit more optimized approach rather than a forced interrupt.
I tested that statement:
def test_break():
while True:
break
return
for _ in range(0, 10000000):
test_break()
python test_break.py 1.71s user 0.15s system 99% cpu 1.874 total
def test_return():
while True:
return
for _ in range(0, 10000000):
test_return()
python test_return.py 1.62s user 0.15s system 99% cpu 1.778 total
So, every 10.000.0000 executions we save 0.1s. I think we can leave with the extra cost of break
if we save a few bytes of memory and avoid duplicating code :P
@ogenstad, yes an ordereddict is probably a good idea. Will look at your PR and at changing the logic to use an ordereddict to preserve order. |
@mirceaulinic regarding the ntp case you gave, that’s exactly how it works. Check test case 1, there is a logging and a bgp neighbor examples that I think behave as you described. |
return parsed | ||
|
||
|
||
def _can_have_multiple(command): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Seems like a platform dependent list, although most Cisco rip-offs use pretty much the same syntax. For NXOS "ip access-list", ""object-group ip" should be added. I wonder if we can scan through the show running config lines or candidate config lines, split the words and check if one word, two words, and three words key have collusion (multiple occurences) if so build this EXACT_MATCHES list dynamically from this collusion list. This will only work if existing config (running or candidate) have collusions though.
* Add vlan example * Fix typo
This framework does two things:
https://github.com/napalm-automation/napalm-base/compare/indent_differ?expand=1#diff-cae653bb93fa20df55fada139a153965R22
IndentedConfig
objects.What I would like to do is have people help testing (2). To do it you need to create a folder like this one;
test/unit/test_indent_differ/test_case_1/
. Inside, you have to put three files:cli.1.show_running_config.0
. A configuration file retrieved from a device with a poor configuration model like IOS uses.candidate.txt
. A bunch of commands you want to merge on the device.diff.txt
. The expected diff.Finally, run the test with
py.test test/unit/test_indent_differ.py::Test_Indent_differ
Feel free to send PRs against the branch
indent_differ
for tests both passing and failing. The more we have the more assurance we will have this actually works.