-
xaero
is there a list of the last remaining closed source bits (if any) required to build illumos?
-
andyf
-
xaero
andyf: thanks!
-
sommerfeld
Hmm, I think I have some more work to do on the python backtraces - line numbers are sometimes quite wrong..
-
jclulow
jbk: That's why src.illumos.org stopped indexing the other things :P
-
jbk
i can totally understand :)
-
jbk
of the 10-30s / file it was taking on my system, I am curious to know what the breakdown of what it's doing
-
gitomat
[illumos-gate] 16268 libadutils: open_conn could print additional ldap error information -- Matt Barden <mbarden⊙rc>
-
sommerfeld
so it's probably the wrong way to do it but I have a kludge to work around the .cold thing.
code.illumos.org/c/illumos-gate/+/3287
-
fenix
→ CODE REVIEW 3287: XXXXX CTF lookup kludge for .cold symbols (NEW)
-
rmustacc
So is there no dwarf being generated for the .cold functions?
-
rmustacc
Put differently who generates the .cold symbols?
-
rmustacc
If this is going to end up in libproc, we're going to want to constraint this assumption as the python logic is definitely not a universal assumption.
-
rmustacc
(Unless there is some broader generalization)
-
sommerfeld
GCC is generating the .cold symbols when run with some options that enable profile-guided optimization.
-
sommerfeld
I'm not sure how to check the dwarf for that
-
sommerfeld
yes
-
sommerfeld
oops
-
rmustacc
OK, so I'd start with dwarfdump I guess.
-
rmustacc
And you're still doing the ctfconvert/ctfmerge style pass right?
-
rmustacc
So presuming that you aren't stripping this, I'd use the dwarfdump either from the libdwarf or llvm utils and see if there's an entry related to that with type information.
-
sommerfeld
I switched to the single ctfconvert at the end on every elf file in DESTDIR
-
rmustacc
Does that occur before or after pgo?
-
sommerfeld
After
-
rmustacc
OK. I guess then the thing we'll need to figure out is how gcc is representing the relationship between the functions in dwarf.
-
sommerfeld
PGO process: build with one set of options to generate a profile; run a set of tests to exercise the first build and dump a profile; then rebuild with a different set of options to generate a new binary influenced by the profile.
-
rmustacc
Maybe there's a DW_AT_abstract_origin.
-
rmustacc
Though not sure if that'll be in dwarf 2.
-
sommerfeld
I switched to dwarf 4 at your suggestion
-
rmustacc
So I think the question is what dwarf if anything is gcc generating for .cold.
-
sommerfeld
what I see for the function of interest is two entries in DW_AT_ranges, one which has a low bound matching _PyEval_EvalFrameDefault, and the other matching _PyEval_EvalFrameDefault.cold
-
rmustacc
Yeah, though we don't really look at the ranges as usually that's for a different part.
-
rmustacc
I'd exepct there's a DW_TAG_subprogram for the function.
-
rmustacc
Able to toss one of these somewhere and maybe I can look in parallel a bit later today?
-
sommerfeld
here are the two DW_TAG_subprogram entries for the function of interest:
gist.github.com/Bill-Sommerfeld/e41f630ede7dd5e15d966e3cd5866a26
-
sommerfeld
the whole dump is large (80MB)
-
sommerfeld
I don't see any mentions of .cold in the dwarfdump output
-
rmustacc
Sure, I just meant the raw binary.
-
rmustacc
Is the .cold really going to be a 1:1 entry point?
-
rmustacc
Or is it just a portion of it that's cold?
-
sommerfeld
it's just part of the function that has been classed as cold
-
rmustacc
If that's the case I'm not sure we can assume the arguments are the same.
-
sommerfeld
it's part of the code of the function that has been relegated to the cold path.
-
rmustacc
Yes, I get that.
-
rmustacc
I just mean that if you were to lie about that in libdtrace, it would do the wrong thing.
-
rmustacc
Well, libproc to libdtrace.
-
rmustacc
The pid provider would think oh on entry to this function I have these args which it doesn't.
-
sommerfeld
rather than all the function being in one contiguous range of addresses, it's in two discontiguous pieces.
-
rmustacc
Yes, I understand.
-
rmustacc
I'm just saying that lying about the argument types is going to confuse things.
-
rmustacc
Particularly if there's ever a call into that other symbol.
-
sommerfeld
it would be wrong for dtrace to think the .cold thing is an entry point -- because it's the middle of the function.
-
sommerfeld
I wouldn't expect any *calls* to foo.cold. Just jumps.
-
sommerfeld
from within the .cold segment and the main segment of the function.
-
rmustacc
Sure, but that's not what your change told it.
-
rmustacc
It's a valid entry in the symbol table, so how can it know those special semantics?
-
rmustacc
The pid provider can happily instrument it. Maybe it won't find a synthetic entry probe.
-
sommerfeld
from the point of view of read_args(), if PC is inside either range, you're in a function of 3 args of these types..
-
rmustacc
Sorry, I realize this is just me coming off as saying no. Not my intent. Just want to figure out how we get there.
-
rmustacc
But that's only because you know that in this case .cold has those semantics.
-
rmustacc
That is you know that this binary was produced with pgo and that gcc has promised that, I think?
-
sommerfeld
so the trick is to how to get this from the elf/dwarf records into CTF, that this function has two extents
-
sommerfeld
rather than the usual one
-
rmustacc
The problem is that it's two different symbols.
-
rmustacc
And there is no real .cold function based on everything you described.
-
rmustacc
What
-
rmustacc
Just for my own curiosity, what's the disassembled cold version look like?
-
sommerfeld
basically in foo, some conditional branches to foo.cold+offset (for varying offsets)
-
rmustacc
If you can't give me the binary can I get the disasm?
-
sommerfeld
I can get you the binary - give me a sec
-
rmustacc
Take your time.
-
sommerfeld
-
sommerfeld
sha256 should be a103b7431ad8f8be82faa0c53cd25c3e57da633ffe09b49d168d2f431ba6d32b
-
sommerfeld
note that there are many foo / foo.cold pairs and _PyEval_EvalFrameDefault is probably one of the most complicated ones in there
-
sommerfeld
as it's the core of the bytecode engine
-
rmustacc
Thanks, I have it.
-
sommerfeld
which is why it's interesting for pstack -- every python function evaluation goes through it.
-
rmustacc
So there are definitely some challenges writ large here as we look at as even if none of these have entry probes, they do definitely have return paths.
-
rmustacc
It sems likely that gcc won't trick us up by pushing a frame pointer in these, though I'm not sure it makes this promise.
-
rmustacc
As if it did, that would throw off saveargs.
-
sommerfeld
yes, you'd expect the unlikely return paths to be relegated to the .cold section.
-
rmustacc
I've learned to try not to expect too much from a compiler.
-
sommerfeld
Indeed.
-
sommerfeld
give it an unrepresentative profile and I'd expect the hot return paths to be sent to .cold...
-
sommerfeld
I've seen FDO cause some really unpredictable performance regressions build to build because sometimes it nailed it and sometimes it didn't.
-
sommerfeld
actually, given the documentation for gcc's -fshrink-wrap I can totally believe there might be frame creation in the .cold extent of the function.
-
sommerfeld
-
sommerfeld
ah, -freorder-functions enables this splitting even absent profile feedback. minimal example:
godbolt.org/z/9rYbzoh4x
-
sommerfeld
minimal example without compiler warnings:
godbolt.org/z/GKdzzvj88
-
rmustacc
Yeah, that's why we disable this stuff in illumos.
-
gitomat
[illumos-gate] 16250 use zfs_nicebytes() to print human-readable sizes -- Toomas Soome <tsoome⊙mc>