Very funny, gdb. Ve-ery funny (yosefk.com)
132 points by luu on Jan 11, 2014 | 58 comments


It'd be nice to know what version of GDB this was.

Years ago, I used to be the C++ maintainer for GDB. By the time I left, things like doing this on a live process worked about 90% of the time (up from 0%). It turns out what he's trying to do is harder than one would think, but it should actually work.

Part of the reason it doesn't work all that often on core dumps is that you often need access to vtables and thunk functions (which you could theoretically execute without a live process in most ABIs, but GDB doesn't know this). Depending on the class, there may also be run-once constructors (for static data and their associated guard variables), etc.


https://stackoverflow.com/questions/253099/how-do-i-print-th... I think this might possibly work for the core dump? I don't have time to check now myself...


Took the time to check and it doesn't work with a core dump. Oh well.


std::vector does not have any virtual functions, so how come it needs the vtable for this?

http://www.yolinux.com/TUTORIALS/src/dbinit_stl_views-1.03.t...

Here they have some functions that inspect a vector by means of _M_impl and friends; why can't gdb do this?


I was explaining the general problem.

As for std::vector, GDB can and does do this now, through python based pretty printers.

That gdb script is completely unnecessary for this now. Most distros ship with gdb + appropriate pretty printers pre-installed, but they are also available in the libstdc++ repo.

See https://sourceware.org/gdb/wiki/STLSupport

If you use this, print on a vector just works, no extra commands or anything needed.
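
For a sense of the mechanism, a from-scratch printer might look roughly like this. This is a minimal sketch, not the shipped libstdc++ printer, and it assumes libstdc++'s internal _M_impl/_M_start/_M_finish layout:

  # vec_printer.py -- load in gdb with: source vec_printer.py
  import gdb

  class StdVectorPrinter:
      def __init__(self, val):
          self.val = val

      def to_string(self):
          impl = self.val['_M_impl']
          return 'std::vector of length %d' % int(impl['_M_finish'] - impl['_M_start'])

      def children(self):
          impl = self.val['_M_impl']
          start, finish = impl['_M_start'], impl['_M_finish']
          for i in range(int(finish - start)):
              # yield (name, value) pairs for each element
              yield ('[%d]' % i, (start + i).dereference())

  def lookup(val):
      # crude type match; the shipped printers use a regexp table
      if str(val.type.strip_typedefs()).startswith('std::vector<'):
          return StdVectorPrinter(val)
      return None

  gdb.pretty_printers.append(lookup)

Since this only reads the inferior's memory, it works on core dumps too.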


The pretty printers let you print the entire vector, but do they let you access individual elements within the (potentially very large) vector? That's what this blog post is complaining about not being able to do.


Works absolutely fine on my very standard, almost vanilla installation.

Code:

  #include <vector>
  using namespace std;
  int main()
  {
    vector<int> beans;
    beans.resize(5);
    beans[3] = 4;
    beans [4] = 66;
  }
Build command:

  g++ -g 156.cpp
gdb debugging session:

  GNU gdb (Ubuntu/Linaro 7.4-2012.04-0ubuntu2.1) 7.4-2012.04
  SKIPPED OUT WARRANTY INFO HERE
  Reading symbols from /badcode/a.out...done.
  (gdb) break 156.cpp:10
  Breakpoint 1 at 0x4008e6: file 156.cpp, line 10.
  (gdb) run
  Starting program: /badcode/a.out

  Breakpoint 1, main () at 156.cpp:10
  10	  beans [4] = 66;
  (gdb) print beans[3]
  $2 = (int &) @0x60401c: 4
  (gdb) print beans[4]
  $3 = (int &) @0x604020: 0


So, I thought the post above me asked a different question, related to printing out vectors in general :) Possibly they didn't mean to, but asking gdb to "p v[0]" executes a function call to operator[]. As mentioned in the grandparent, this should work fine on a live process, but is harder on one that isn't.

It's possible the parent meant "why can't GDB make v[0] do an inspection directly on the structure?". The simple answer is: "It can't possibly know what the function call does." So you'd have to teach it, specifically, that std::vector is special, or provide a mechanism to override the native calls with python scripting (i.e. be able to say "here is a replacement for T &operator[](size_t) on std::vector that is written in python and works without a live process").
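
GDB has since grown exactly this mechanism; it's called xmethods. A rough sketch of a python replacement for vector's operator[] (the names are mine; it assumes libstdc++'s _M_impl layout and works by reading memory, so no live process is required):

  import gdb
  import gdb.xmethod

  class VectorSubscriptWorker(gdb.xmethod.XMethodWorker):
      def __init__(self, elem_type):
          self.elem_type = elem_type

      def get_arg_types(self):
          return gdb.lookup_type('unsigned long')

      def get_result_type(self, obj, index):
          return self.elem_type

      def __call__(self, obj, index):
          # plain memory reads -- no inferior call, so core dumps are fine
          start = obj['_M_impl']['_M_start']
          return (start + int(index)).dereference()

  class VectorSubscriptMatcher(gdb.xmethod.XMethodMatcher):
      def __init__(self):
          gdb.xmethod.XMethodMatcher.__init__(self, 'vector::operator[]')

      def match(self, class_type, method_name):
          if method_name != 'operator[]':
              return None
          if not str(class_type.tag or '').startswith('std::vector<'):
              return None
          return VectorSubscriptWorker(class_type.template_argument(0))

  gdb.xmethod.register_xmethod_matcher(None, VectorSubscriptMatcher())

libstdc++ now ships a set of these alongside its pretty printers, so "p v[0]" can work even on a core dump.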

GDB actually does real overload resolution (it has to, otherwise in some 'simple' cases, it'd have to ask you 100 questions about what functions you are trying to call), making that strategy more complicated.

If your only mechanism for reproduction is a core dump, yes, this is a pain in the ass. But that should rarely be the case, since you always have things like gdbserver, etc.


In the hw world we debug with waveform viewers (Simvision, DVE, and the like). You can see the value of any signal at any point in time. Move forward and backward in time. Search for when a bus switches, or even for when it switches to a particular value. You can even say "show me all points at which foo == 16'hdead and bar == 16'hbeef and x is high but y is not 8'h7."

Now I understand it's a different problem, but sometimes I think you sw guys live in the dark ages.


No, sw guys do not live in the dark ages, and if by debugging waveforms you mean 'the value at any point in time', then to me it's you who needs to find his place in history. I'm more of a hw guy than a sw guy, but I see the problem here: debugging is not "show me all points at which foo ==", so you have no idea what you are talking about.


Oh come on. HW developers don't even use source control.


Well, things have improved a lot, but there's way more truth in what you say than there should be.


"but sometimes I think you sw guys live in the dark ages."

Well those using gdb are, yes. (ducks)


Is there anything that works on Linux other than the steaming pile of crap commonly known as Eclipse?


LLDB, perhaps?


I did grad school in computer architecture and did a lot of cycle-level debugging in simulators, and I agree this mode of debugging is very very useful.

I think that the closest analogue to a waveform view in software may actually be printf()/log-based debugging, in the sense that there's a sort of temporal vs. spatial tradeoff going on: with waveform views or logs (temporal view), you see only the signals/data you've chosen to record, but you can easily see sequences through time, which is useful for many sorts of systems. With a traditional debugger (spatial), you can poke around in memory arbitrarily at a breakpoint, but it's more difficult to see time (you have to explicitly step forward, and usually can't back up). Both are useful.


We have conditional breakpoints. We do, unfortunately, always have to execute the program forward from the beginning, though, yes.

Actually, there are a few debuggers that can step backwards, but it's not mainstream and the memory overhead is apparently enormous.


GDB's had reverse-step since 7.0 (released in 2009).
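
Rough usage (recording has to be turned on before you can go backwards):

  (gdb) break main
  (gdb) run
  (gdb) record
  (gdb) next
  (gdb) reverse-next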


That isn't a fair comparison, because your scope is only showing you an extremely precise, narrow view of the problem. It would be like having a software debugger that could show you the past and present state of the processor registers at every time slice. For application-level debugging, it would be completely useless.


> There is this crazy effort underway to make every linux process serializable: allow not only memory to be [un]dumped but also the status of all open file descriptors (including bringing sockets into the same mode) and the rest of the environment too.

Anyone got any pointers on this?


This is called application or process checkpointing. There are/were lots of attempts to get it working under Linux, some more and some less complete. It was a hot topic several years ago, but interest has sort of fizzled out since then because virtual machines solve the same problem. A VM has more overhead, but freezing a VM is also much more reliable.

This Wikipedia article mentions a few approaches: http://en.wikipedia.org/wiki/Application_checkpointing

This page has many more projects, though most are probably dead by now: http://checkpointing.org/

Both lists are incomplete; googling "linux checkpointing" returns things not mentioned in either, like https://ckpt.wiki.kernel.org/index.php/Main_Page


Yep, this is very important for grid computing where a user may log into an organization's workstation and the grid job running on the workstation needs to be evicted. With process checkpointing, you can resume the work later (perhaps even on another node). Without it, all of the intermediate progress is discarded.


> Checkpoint/Restore In Userspace, or CRIU (pronounced kree-oo, IPA: /krɪʊ/, Russian: криу), is a software tool for Linux operating system. Using this tool, you can freeze a running application (or part of it) and checkpoint it to a hard drive as a collection of files. You can then use the files to restore and run the application from the point it was frozen at. The distinctive feature of the CRIU project is that it is mainly implemented in user space.

http://criu.org/Main_Page
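
Basic usage is roughly this (flags as documented by the CRIU project; the images directory is a placeholder):

  criu dump -t <pid> -D /tmp/ckpt --shell-job
  criu restore -D /tmp/ckpt --shell-job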


As always, http://lwn.net has the best overviews; e.g. start here: https://lwn.net/Articles/525675/


Another userspace tool for process checkpointing is DMTCP: http://dmtcp.sourceforge.net/. From the About page:

"DMTCP (Distributed MultiThreaded Checkpointing) is a tool to transparently checkpoint the state of multiple simultaneous applications, including multi-threaded and distributed applications. It operates directly on the user binary executable, without any Linux kernel modules or other kernel modifications."


Pointers, or references? ;-)


> still-not-boundary-checked [...] std::vector

Huh? vector::at() is boundary-checked. So you can choose whether you want to pay for boundary-checking or not.


You can't switch between .at() and [] from one build to the next without editing every call site. There is a lot to be said about MSVC's checked iterators, but it sure does catch a lot of mistakes during debug builds. Any sane STL implementation should have something similar.


FYI, with libstdc++ (GCC) you can define _GLIBCXX_DEBUG and get a lot of runtime checks too, including bounds checking on vectors.

http://gcc.gnu.org/onlinedocs/libstdc++/manual/bk01pt12ch30s...
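
A tiny sketch of the difference (hypothetical file name; build once plain and once with the macro):

  // g++ -g at_demo.cpp                   -> v[7] is silent undefined behavior
  // g++ -g -D_GLIBCXX_DEBUG at_demo.cpp  -> v[7] aborts with a diagnostic
  #include <cstdio>
  #include <stdexcept>
  #include <vector>

  int main()
  {
    std::vector<int> v(5);
    try {
      v.at(7) = 1;   // always bounds-checked: throws std::out_of_range
    } catch (const std::out_of_range &) {
      std::puts("at() caught the overrun");
    }
    v[7] = 1;        // unchecked by default; checked under _GLIBCXX_DEBUG
  }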


#define at operator []

I kid...mostly.


>Nobody's holding a gnu to your head.

comment reproduced here without further context


I noticed the same and was resisting the urge to post it here. :)

I certainly hope it was intentional. :D


Yes, GDB can feel a little crappy at times. But on the other hand it is quite versatile: it can debug programs written in several different languages, running on several different hw architectures, in user space or kernel space, on bare metal or in QEMU, etc., and it can debug remotely over a serial line or network. All this comes at a price.

GDB can never be as nice as the language-specific, graphical point-and-click debuggers that some programming environments have, which are built in conjunction with the compiler/interpreter for that language.

And the problem is not only GDB, it's the whole infrastructure that is involved, binary formats, debug symbols and protocols, binutils, etc. It is easy to point the finger at GDB but there's only so much that GDB can do in the environment(s) that it works in.

There's plenty of room for improvement but it's not only GDB that would require work, it's the entire software ecosystem.


Well, I guess it fits right in with GNU: If it sucks, add more features and port it to more platforms.


From the comments:

Debugging std::vector isn't too bad. You can dig in to the...

Whenever a programmer says "[X] isn't too bad", that is a possible signature of an opportunity.


Yet C++ debuggers on Windows and Mac OS X have done this for ages.


Yeah this is clearly a problem with gdb, not std::vector. Really gdb is pretty rubbish and hasn't been improved for many years. LLDB should be much better.


That's a "problem" with the compiler, not gdb. Also, the "problem" is that the compiler actually optimized the code, when ordered to do so, instead of just pretending it did (like MS tools do when debuging). It's also a very usefull "problem" to have around.

I really doubt LLDB can discover what code the operators execute if the compiler does not include it in the binary.


When debugging optimized C++ in Visual Studio you'll run into the same problem. The 'see ASM' feature is easy enough to use that it helps, but there's no free lunch.


Probably the compiler should both inline it and include a standalone version when debugging is enabled.


No, because you need to debug the code you're actually running, not some other code that's easier to debug.


I believe we're talking about a function such as std::vector::operator[]. You wouldn't usually be debugging a particular attempt to access an element of a vector, you'd have a vector and want to easily see what's in it, using a debugger-constructed call to the function.


> gdb is pretty rubbish and hasn't been improved for many years

That's such a dirty lie. Example: GDB has reverse step. That's such an amazing debugging feature that Visual Studio users often refuse to believe it's real when you try to describe it to them.


Bullshit. GDB is improving constantly. Last time I checked LLDB was inferior. So stop spreading lies!


I downvoted you because of the tone of your post, but I do tend to agree with you - almost everything about LLDB strikes me as pretty crappy, although unfortunately the same applies to GDB. In this case, LLDB is no better:

    (lldb) print v[0]
    error: call to a function 'std::__1::vector<int, std::__1::allocator<int> >::operator[](unsigned long)' ('_ZNSt3__16vectorIiNS_9allocatorIiEEEixEm') that is not present in the target
    error: The expression could not be prepared to run in the target
    (lldb) version
    lldb-300.2.53


Don't know how others do it, but I find I have to recompile a program without optimization flags in order to be able to fully debug a live C++ program. Understandable but annoying.
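
One middle ground, assuming a reasonably recent GCC (-Og appeared in GCC 4.8), is the optimize-for-debugging level, which only applies optimizations that don't interfere with debugging:

  g++ -Og -g program.cpp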


This applies to any language with AOT compilation to native code. There are lots of optimizations that cannot be properly mapped back to the original source code.

In most languages compilers are allowed to do whatever they want as long as the semantic outcome stays the same.


Yeah I avoid gdb like the plague. Any other platform, any other debugger. Even Microsoft VS does better.


"Even Microsoft VS does it better?" I have yet to see any debugger that's better than Microsoft's. Their compiler is behind GCC, clang, and others, but their debugging environment is unmatched.


JVM code hotswapping is still more feature complete than edit and continue for managed code, which is still way too conservative to be very useful. It is not a clear overwhelming win for VS (perhaps it is for native, but I don't work with C++).


Reminds me of a quote I've heard a few times: "gdb is designed to make you hate using it, encouraging you to be more careful when you write code."


"Unices"???


Plural of Unix.


"No really, there used to be Unices with a program called undump that did just that."

Isn't that how they used to build emacs? There was some multistage bootstrapping procedure that culminated in undump'ing a thing that got turned into a binary.


emacs and some of the TeX programs worked that way. Maybe they still do. I haven't looked at that in a quarter century.

To elaborate: the programs would do all of their initialization and loading of things that would be used in any run, then dump themselves. The result of undump was an executable that had already done all of its boilerplate loading and was ready to process its arguments and inputs. It was bigger, since it had a bunch of heap stored, but it was better than hammering away on your CPU at 1 million instructions per second.


That's pretty cool. Is there a way of making a java program do this? The JVM startup time on a desktop kills me.


Java Web Start or similar. In the DOS world it was called a TSR ("terminate and stay resident").


Glad to see someone beat me to this comment. Emacs would have had a startup time measured in minutes were it not for this.




