13 posts in 'Computer_language/Debug'

  1. 2009.01.12 ns2 gdb debug related files
  2. 2009.01.12 Pedro Vale Estrela - NS2 Debug / BugFix Tutorial (OTCL + C++)
  3. 2009.01.12 Using GDB Well 2: User Defined Commands

ns2 gdb debug related files

introduction to ns2 : http://ub-tech.com/~mo/ns2.htm; with debug
ns2 installation
ns2 debugging

Other posts in the 'Computer_language > Debug' category

Pedro Vale Estrela - NS2 Debugging Page  (0) 2009.01.12
[From NS-User]  (0) 2009.01.12
NS2 Programming  (0) 2009.01.12
Pedro Vale Estrela - NS2 Debug / BugFix Tutorial (OTCL + C++)  (0) 2009.01.12
Using GDB Well 2: User Defined Commands  (0) 2009.01.12

Pedro Vale Estrela - NS2 Debug / BugFix Tutorial (OTCL + C++)

This tutorial shows how to use the OTcl and C++ debugging tools to find a bug that is present in NS 2.28 and earlier.
(Recent CVS snapshots and the future 2.29 release have it corrected, as the patch has already been applied to the CVS tree.)
It guides you through a typical debugging process that is useful in a variety of situations.

Files and Patches (contains the scripts patches mentioned in these pages)
Contact: pedro.estrela@inesc.pt


--------------------------------------------------------------------------------


NOTE 1: The tutorial mentions several scripts. These are available in the files directory. I've also made a compressed file with everything you'll need for this guide: test-suite-hier-routing-bug.zip. It also depends on a recent version of my ns2_shared_procs.tcl file.
NOTE 2: This tutorial gives more detail on the OTcl debugging part. However, an experienced NS developer could jump directly to the C++ debugging part by closely studying the call stack dump.
NOTE 3: A different way to modify the built-in Tcl functions would be to edit them directly in the Tcl source files and recompile NS. However, the method outlined below is preferable for beginners, as it doesn't require recompilation and doesn't change the existing code (so backing the change out is trivial, if necessary).
NOTE 4: As in my other tutorial, I present a complete script and an image for each step. However, you should try to make the required modifications by hand, to get a much better understanding of modifying NS.


--------------------------------------------------------------------------------

Step 0 - Prologue
The bug investigated in this tutorial appeared when I tried to add dynamic routing capabilities (e.g., the ability to simulate recovery from link failures) to a fairly complex script that featured a large topology of wired links, coupled with several wireless links (both base stations and pure mobile nodes). Another relevant point is that the script used hierarchical routing in all the nodes, for Mobile IP usage.
The script was working 100% until I added instructions to simulate a link failure on one of the wired links. As explained in this section of Marc Greis's tutorial, all that is required is to add "$ns rtmodel-at time up|down node1 node2" commands to the script.
However, the problem appeared when I enabled "Session" dynamic routing to use the alternative paths of the wired topology (i.e., using "$ns rtproto Session" at the start of the script).
At run time, the simulator crashed in the middle of the simulation with the following error: test-suite-hier-routing.error1.txt.

Step 1 - Choosing a simpler scenario that is known to be correct
To isolate a bug, a common heuristic is to simplify the scenario by removing parts that (hopefully) are unrelated to it. In the above example, I suspected that the bug was somewhere in the interaction between dynamic routing and hierarchical routing (as the simulator crashed when the wired link went down); in that case, the wireless nodes and complex topology only obscured the real problem. (As will be shown later, this supposition was correct.)
On the other hand, I also wanted to validate my own script, as I could be doing something in it that corrupted the simulator.

Thus, one good approach to finding the bug is to start from a known correct scenario and slowly introduce minimal features until the bug appears. The best starting points are the standard test suites included in NS2, which are used to validate the simulator itself against the most recent modifications and patches.
(Note: as explained here, these test suites are the only scripts that are guaranteed to use the latest APIs; in contrast, the examples in "ns/tcl/ex" and Marc Greis's tutorial are known to be out of date, especially the wireless examples.)

Searching in "ns/tcl/test", I found that the only test suite that used hierarchical routing was "test-suite-hier-routing". This test uses a simple non-redundant topology (i.e., only direct paths) and regular static routing. It produces a topology with 9 nodes.

Script: test-suite-hier-routing_1.tcl
Result: hier_step1.gif

To run the simulation: "ns test-suite-hier-routing.tcl hier-simple"
To view the simulation in nam: "nam temp.rands.nam"

Step 2 - Making the bug appear in the simpler scenario
Now let's try adding a new link between nodes 5 and 7, and make it go down at time 2. This is done by adding these 2 lines to the script, in "TestSuite instproc init".
$ns_ duplex-link $n_(5) $n_(7) 5Mb 2ms DropTail
$ns_ rtmodel-at 2 down $n_(5) $n_(7)

Use the same commands as before with the new script. With it, the traffic first goes over the new link; at time 2 the link goes down and all packets sent over it are lost, making nodes 7, 8 and 9 unreachable.
Script: test-suite-hier-routing_2.tcl
Result: hier_step2.gif

Now, let's use dynamic routing to correct this, choosing type Session. Just add "$ns_ rtproto Session" after the simulator object creation, in init-simulator {}.
If you now run the new script, it will crash with the exact same error as before. Good work! Now we have a much simpler scenario that is sufficient to trigger the bug, and that will be much easier to debug!
Note that only now should you ask on the NS2 mailing lists about the bug that you've found, to learn whether somebody has already worked on a fix. It is fairly important to use a simple scenario like the one explained here. As an example, check the email I sent to the NS developers' mailing list for this very bug: Bug report
Script: test-suite-hier-routing_3.tcl
Error (Call Trace): test-suite-hier-routing.error2.txt

Step 3 - Getting to know what is going on at the beginning of the simulation
In this section we'll take an inside look at the Tcl objects that are created by the script, to get an insight into the inner workings of the simulator. I assume that you've followed and experimented with my tutorial on OTcl debugging.

The idea is to stop the simulator immediately before the simulation starts (i.e., before $ns run). For this:
a) modify the script to include a new MashInspector object in "Test/hier-simple instproc run", before "$ns run";
b) make it stop before "$ns run" with "debug 1";
c) modify it to run the "hier-simple" test, ignoring command line parameters (check runtest() of the resulting script if it's too difficult);

Then use the resulting script as follows:

a) start nstk without parameters. It should open the tkcon console.
b) start the script ("source test-suite-hier-routing_4.tcl").

You'll now see Mash's Object Inspector, which you can use to peek into the OTcl objects that are created at the start of the simulation. In the following figure, I'm inspecting the main "simulator" object, which is created in the script by "set ns [new Simulator]". For this, I've selected the "Simulator" class in the first column, and its unique instance in the fourth column (in my case, it was object _o5).
At this stage you'll find _o5's private variables in the last column, namely the node information (array Node_[]), each link (link_[]), and private variables that contain references to the names of other core objects, such as the scheduler in use, the type of trace in use, etc. Another important column is the second, as it contains the references to the procs available to the selected object. If you click on one, you'll see its source code (this will be very important in the next step).

Navigating through the references, you can now inspect each object in succession. For example, clicking on the private variable "routingTable_" takes you to an instance of "RoutingLogic" that contains a private variable rtprotos_(Session). This is enough to confirm that you are correctly using the "Session" type of dynamic routing.
This technique is useful for checking the inner state of the objects created by your script before the simulation, to make sure that they start as intended.
However, our specific error occurs at run time, at time 2.0, when the link goes down. Thus, our next step is to stop the simulator at exactly this event.

Script: test-suite-hier-routing_4.tcl
Image: hier_step3.gif

Step 4 - Getting to know what is going on immediately before the crash at run time
The idea for debugging at run time is to insert "debug 1" commands at interesting points of the code, to break the execution there. To find those points, we'll check the Tcl call stack that the simulator dumps when it crashes: it starts from the innermost Tcl procedure that crashed, then the function that called it, and so on, up to the first Tcl function that started the call chain.
In our case, the outermost function (i.e., the first one called) is the "runq" proc. Note that there is no easy reference to the actual Tcl source file that contains this function; to find it, do a recursive grep for the string "runq" in the whole ns/tcl source tree:

ns/tcl> grep -d recurse "runq" *

You'll see that this procedure resides in the file "rtglib/dynamics.tcl". Next, you could simply modify the "runq" proc and recompile NS. However, Tcl allows any given proc to be replaced at run time; thus, to avoid modifying the ns2 core files, we'll copy the "runq" proc into our script and insert the "debug 1" instruction in our private copy.

Script: test-suite-hier-routing_5.tcl
Image: hier_step4.gif

The next image shows the actual interaction at run time. Notice how I've confirmed the current simulation time when the debugger breaks in (i.e., 2 seconds); for that, I just called the "now" proc of the simulator object on the bottom evaluation line (also note that I'm showing the actual code of the "now" procedure).

Now you can position yourself on the currently running object, in order to inspect it. For this, run "puts $self" in the debugger window and find the object name in the list of all instances. Then open the "runq" procedure, as shown in the following image.
Image: hier_step5.gif

You are now on an rtQueue object, which has a list of events (see array rtq_[]). You can now trace step by step in the debugger window (using the Enter key in the debugger console), and check the code about to be executed in the MashInspector window at each step. This will take you, step by step, through all the procs that are mentioned in the call trace after the crash. At any time you can also inspect the internal state of the objects, to check for logical bugs.

Using these techniques, and "debug 1" statements placed closer to the crash further up the stack, you'll eventually reach the "simulator compute-hier-routes" function, and conclude that the bug is triggered when the "$r hier-reset $srcID $dstID" line is called. (compute-hier-routes is in ns/tcl/lib/ns-route; use recursive grep to find its location.)
The following script has debug code immediately before this function call, to produce the corresponding screenshot:
Script: test-suite-hier-routing_6.tcl
Image: hier_step6.gif

Here, I'm checking the values of the parameters for the link (_o12), the source node (1.1.0) and the destination node (1.0.0). As all these values are correct, let's now check the proc itself (hier-reset).

Step 5 - Understanding Shared C++ / TCL procs
For this, we'll go to object _o12 and check its procs. However, as can be seen in image7, this proc doesn't appear in the list. This happens because of a powerful (but confusing to beginners) mechanism that simplifies calling C++ procedures from Tcl.

When a nonexistent procedure is called on an OTcl object, the tclcl library that is part of the core ns modules calls the command(argc, argv) method of the corresponding C++ object, with all the parameters passed as strings.
This function inspects the command name in the arguments and, if it is known, executes it; if not, an error is returned.

Thus, the procs that an object can execute are:
- defined in its own OTcl class;
- inherited from parent OTcl superclasses;
- contained in the command() method of the corresponding C++ class.

However, only the first kind appears directly in the object inspector; the inherited OTcl procedures are visible if one chooses the parent classes in the inheritance column (3rd column). As the hier-reset proc isn't present in the OTcl class or its superclasses, it has to be in the C++ code.

For this, make a recursive grep from the base of the ns2 tree:
ns2> grep -d recurse "hier-reset" *

(NOTE: a much faster way is to check only the .cc files, for example:
grep -d recurse "hier-reset" *.cc
grep -d recurse "hier-reset" */*.cc)

The recursive grep tells us that the command is handled inside RouteLogic::command(argc, argv), in routing/route.cc.

The relevant part is:
...
} else if (strcmp(argv[1], "hier-reset") == 0) {
    int i;
    int src_addr[SMALL_LEN], dst_addr[SMALL_LEN];

    str2address(argv, src_addr, dst_addr);
    // assuming node-node addresses (instead of
    // node-cluster or node-domain pair)
    // are sent for hier_reset
    for (i=0; i < level_; i++)
        if (src_addr[i]<=0 || dst_addr[i]<=0){
            tcl.result ("negative node number");
            return (TCL_ERROR);
        }
    hier_reset(src_addr, dst_addr);
} else if (strcmp(argv[1], "hier-lookup") == 0) {
...

We'll now proceed to C++ level debugging. Before that, you should comment out the lines that call the OTcl debugger.
Script: test-suite-hier-routing_7.tcl

Step 6 - Move into C++ debugging
Fortunately, C++ has much better debugging tools. I suggest using ddd, which is a front end to gdb (check details and tutorials here).

Start ddd, open the ns executable (menu File / Open Program), open the source file (menu File / Open Source / route.cc), then put a breakpoint in RouteLogic::command().
Image: hier_step8.gif

Now let's run the program (menu Program / Run / arguments: test-suite-hier-routing_7.tcl).
Notice that you now have a source level debugger window stopped at the breakpoint.
Now step through the code (F5), and notice how the arguments are processed; then the hier_reset() function is called to perform the actual work.
Notice that after hier_reset(), control falls through to the end of the command() function, reaching return (TclObject::command(argc, argv));
Image: hier_step9.gif

This line passes control to the standard Tcl command processor, which doesn't know anything about link failures, hierarchical resets, etc. Thus, it returns an error, and the simulator crashes at run time.

Looking at the other commands processed by this function, a simple pattern is easy to spot:
- each "if" checks the command name (strcmp == 0);
- arguments are collected from the argc/argv array;
- a function is called that does the actual work;
- the branch returns either TCL_OK or TCL_ERROR.

However, this is not the case for our hier-reset branch, as there is no return(TCL_OK) anywhere on its success path.
Thus, control falls through to the default behaviour, which subsequently leads to the simulator crash.

As hier_reset() is a void function, it has nothing to return; thus, we'll decide that the command() function should return TCL_OK, to tell Tcl that it has processed the hier-reset call just fine.
To do so, just insert a "return (TCL_OK);" immediately after the existing hier_reset(src_addr, dst_addr); line. Then recompile the simulator and rerun the script.
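
The fall-through can be reproduced outside NS2. The following is a minimal plain C sketch, not the ns2 source: the names command_buggy, command_fixed, base_command, do_reset and the SKETCH_* status codes are hypothetical stand-ins for RouteLogic::command(), TclObject::command(), hier_reset() and TCL_OK / TCL_ERROR. It only illustrates how a branch that does its work but never returns ends up in the default handler.

```c
#include <string.h>

/* Stand-ins for the Tcl status codes (assumption: the real ones come from tcl.h). */
#define SKETCH_OK    0
#define SKETCH_ERROR 1

/* Counts how often the worker ran, so the effect of each call is visible. */
static int reset_calls = 0;

/* Hypothetical worker, playing the role of the void hier_reset(). */
static void do_reset(void)
{
    reset_calls++;
}

/* Plays the role of the base-class handler: it knows no commands,
 * so every name that reaches it is reported as an error. */
static int base_command(int argc, const char *const *argv)
{
    (void)argc;
    (void)argv;
    return SKETCH_ERROR;
}

/* Buggy shape: the branch does its work but never returns. */
int command_buggy(int argc, const char *const *argv)
{
    if (argc >= 2 && strcmp(argv[1], "reset") == 0) {
        do_reset();
        /* missing return: control falls through to base_command() */
    }
    return base_command(argc, argv);
}

/* Fixed shape: the branch reports success as soon as the work is done. */
int command_fixed(int argc, const char *const *argv)
{
    if (argc >= 2 && strcmp(argv[1], "reset") == 0) {
        do_reset();
        return SKETCH_OK;
    }
    return base_command(argc, argv);
}
```

Calling both functions with the arguments {"obj", "reset"} runs the worker each time, but only command_fixed() reports success; command_buggy() returns the error produced by the default handler even though the work was done.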

You'll then see that it no longer crashes at run time and completes the whole simulation without problems. Then use nam to check that the original goal (hierarchical + dynamic routing) now works: as the following image shows, the link failure at time 2 is instantly "healed" by the Session routing.

Image: hier_step10.gif

Now it's time to go outside and celebrate the bugfix that you've achieved!

Step 7 - Contribute a patch to the NS developers with your newest bug discovery
Er, actually not so fast. :-)
That celebration should be delayed until the WHOLE work is done. And no bug is fixed until a patch is submitted to the NS developers.

This enables the bug to be corrected in the next version of NS2, benefiting the whole community at once; it also saves fellow researchers the time needed to fix the same bug over and over, so that actual research work can be done.

For this, I recommend using CVS to keep track of your own modifications and bugfixes to the simulator.
A simpler way to make a patch is to compare the original and modified source files. For this, try the following line

diff -C3 <original unmodified source file> <your modified source file>

...and send the result in a SHORT but CLEAR email as a bug fix to the developers.

As an example, check the patch report on this very bug: Contributed Patch


--------------------------------------------------------------------------------

Check the files, patches, etc in this directory

Go back to my NS2 page


Using GDB Well 2: User Defined Commands


When debugging or checking a program's stability, using a debugger well can save quite a lot of time. Most developers debug with GDB but hardly use the powerful features it has, so this article looks mainly at features that may not be used often but are very helpful to know.

First, I'll assume that readers know the basics of GDB (in particular break, run, continue, file, backtrace, print, and so on). If you don't, please refer to Emacs/GDB/etags/cscope or another GDB manual.

Breakpoints

You probably already know that the break command can typically be used as follows:

(gdb) break                # set a breakpoint at the current line
(gdb) break 31             # set at line 31 of the current file
(gdb) break foo            # set at function foo
(gdb) break list::next     # set at member function next of class list
(gdb) break hello.c:main   # set at function main in hello.c
(gdb) break util.c:300     # set at line 300 of util.c

In C++ especially, you sometimes need to set breakpoints on every member function of a class and examine them. In that case the rbreak command, which sets breakpoints by regular expression, is handy. For example:

(gdb) rbreak f*o           # set on every symbol matching "f*o"
(gdb) rbreak list::        # set on every symbol matching "list::.*"

The second example in particular shows that ".*" is always implied at the end. In fact, when rbreak is given "foo", the regular expression actually used is, strictly speaking, ".*foo.*". So to set breakpoints on all functions whose names start with "foo", write:

(gdb) rbreak ^foo

When you set a breakpoint, each one is given a number (BNUM), and you can use this number for various operations. For example, to see the full list of breakpoints:

(gdb) info b
Num Type           Disp Enb Address    What
1   breakpoint     keep y   0x08066b44 in eventShow() at menubar.cpp:1017
        breakpoint already hit 3 times
2   breakpoint     keep y   0x080b06f4 in Play() at thumbview.cpp:416
3   breakpoint     keep y   0x08066e7e in ActPlay() at menubar.cpp:1085
4   breakpoint     keep y   0x08059cd3 in Play_SS(int, int) at widgets.cpp:2183
(gdb)

The first column (Num) is the unique number (BNUM) of each breakpoint. The second column (Type) shows whether it is a breakpoint, a watchpoint, or a catchpoint (watchpoints and catchpoints will be explained another time). The third column (Disp) shows the disposition of the breakpoint (also explained another time). The fourth column (Enb) shows whether the breakpoint is currently enabled. A breakpoint marked as disabled (n) has no effect. You can switch between enabled and disabled with 'enable br [BNUM]' or 'disable br [BNUM]'. For example, to disable breakpoint 1:

(gdb) disable br 1

To enable all breakpoints:

(gdb) enable br

To disable breakpoints 2 and 4:

(gdb) disable br 2 4

To enable breakpoints 2 through 5:

(gdb) enable br 2-5

and so on.

Sometimes you need a breakpoint that is used just once. For this you can use enable br once [BNUM] or enable br delete [BNUM]. For example, the command below enables breakpoints 1 and 3 and disables each of them again as soon as it is hit:

(gdb) enable br once 1 3

The command below enables breakpoint 4 and deletes it once it has been hit:

(gdb) enable br delete 4

One of the most useful features is attaching a condition to a breakpoint so that it stops only when the condition holds. For example, suppose we have the following code:

int i = 0;

/* do something #1 */

for (i = 0; i < 1000; i++) {
    /* do something #2 */
    /* do something #3 */
}

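Suppose the program misbehaves inside the loop when i is 456. If you put a breakpoint on the "do something #2" line (say it is breakpoint 8), execution stops on every iteration, exactly 1000 times in all. Continuing by hand until i reaches 456 is very tedious. In this case you can specify a condition as follows:

(gdb) cond 8 i == 456

That is, breakpoint 8 now stops only when i == 456 holds. Besides simple comparisons against constants, the condition can also contain function calls. For example:

(gdb) cond 8 foo(i) > bar(rand())

Since the example is a simple counting loop, you can instead tell GDB to ignore the hits that come before i reaches 456. Because i starts at 0, those are the first 456 hits (i = 0 through 455). The command to ignore the next N hits of a breakpoint is:

(gdb) ignore 8 456

That is, breakpoint 8 is ignored for its next 456 hits and stops on the following one, when i is 456.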
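
To try conditional breakpoints on something concrete, a small target program helps. The following is a minimal sketch; step() and run_loop() are hypothetical names, and the misbehavior at i == 456 is hard-coded purely for illustration.

```c
/* Hypothetical per-iteration work; it misbehaves only for i == 456. */
int step(int i)
{
    if (i == 456)
        return -1;          /* the "strange" result we want to catch */
    return i * 2;
}

/* Runs the loop n times and returns how many iterations
 * produced a negative (i.e., wrong) result. */
int run_loop(int n)
{
    int bad = 0;
    for (int i = 0; i < n; i++) {
        int r = step(i);    /* a good line for the breakpoint */
        if (r < 0)
            bad++;
    }
    return bad;
}
```

Call run_loop(1000) from a main() of your own, compile with debug information (for example gcc -g), set a breakpoint on the marked line, and attach the condition i == 456 with cond. GDB should then stop only on the one iteration in which step() returns a negative value.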

Now consider the following code:

int i = 0;
int j, k;
long l;

while (1) {
    j = rand();
    k = some_function(j, time(NULL));

    /* do something #1 */
    l = j & 0xFF00 + (int)(log(k) * 3.2108) - ...;

    if (some_condition)
        break;
}

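In this code, j and k change on every iteration. Strangely, l seems to end up with a wrong value when j < k, but we aren't sure. What we do know is that whenever j < k, l must be positive. So we'd like to examine the values l takes over the whole run of the loop. For this, use commands, the GDB command that runs specified commands whenever execution stops at a given breakpoint.

First set a breakpoint right after the "l = j & 0xFF00..." assignment, so that l already holds its new value when execution stops (say it is breakpoint 9), then enter: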

(gdb) commands 9
Type commands for when breakpoint 9 is hit, one per line.
End with a line saying just "end".
>silent
>if j < k
>printf "l is %d\n", l
>end
>cont
>end

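As you may have guessed, 'commands [BNUM] ... end' runs the GDB commands given in "..." whenever execution stops at breakpoint BNUM. Here, silent first suppresses the usual message GDB prints when it stops at a breakpoint; then, if j < k, the GDB printf command prints the value of l; finally cont resumes the program. As a result, each time the program reaches the breakpoint it prints l (when j < k) and continues automatically. This repeats until the loop ends, so you get output like the following: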

(gdb) continue
l is 3
l is -2
l is 2
l is 1
l is -3

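We said that l must always be positive when j < k. From this output it is easy to see that l is sometimes wrong.

The kinds of GDB commands that can be used inside commands will be covered another time.

Sometimes, while following execution with next or step, you reach a loop and want to jump straight to the code after it, or you want to let the current function finish normally and return to its caller. For example: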

for (i = 0; i < 1000; i++) {
    /* do something #1 */
    /* do something #2 */
}
/* do something #3 */

Suppose you have executed up to "/* do something #2 */" and have decided that the loop is fine; you'll want to move quickly to the code after the loop. The until and advance commands are handy here.

Outside a loop, until behaves just like next.

(gdb) until

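At the end of a loop, until keeps running, within the current stack frame (i.e., the current function), until a line past the current one is reached. Put simply, it runs the remaining iterations and shows the "(gdb)" prompt once execution leaves the loop.

advance runs the program onward like continue, but stops as soon as control reaches the location you specify. For example, if "/* do something #3 */" in the code above were on line 34, you could run the following instead of until: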

(gdb) advance 34

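The location given to advance does not have to be in the current function; it can be anywhere, and it accepts not only line numbers but every form of location that the break command accepts.

Consider a server program that receives service request data over the network, parses it, performs the appropriate function, and returns the result. Suppose it has the following shape: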

#define PACKET_MAX      10

int
fetch(void)
{
    int packet_received = 0;
    int received[PACKET_MAX];

    while (1) {
        if (!packet_received) {
            if (recv_data(received, PACKET_MAX) == 0)
                packet_received = 1;
        }

        /* do work here */

        process_packet(received, PACKET_MAX);
    }
    return 0;
}

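Suppose this program normally works fine but misbehaves when it receives a particular packet, and that this packet arrives only very rarely. Unless you have written a separate program that sends packets on demand, you may have to wait indefinitely for the offending packet to arrive before you can debug the program.

Now suppose the packet in question has arrived through recv_data(). At that point packet_received becomes 1 and the processing that follows misbehaves. To save the contents of received at that moment, you can use the following command: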

(gdb) dump binary value buggy.dat received

This command saves the contents of the array received to the file buggy.dat. If you know the start and end addresses, you can use this form instead:

dump binary memory buggy.dat START-ADDR END-ADDR

Here START-ADDR is the start address and END-ADDR is the end address. For the received array above, that gives:

(gdb) dump binary memory buggy.dat received received+10

Whichever method you use, the contents of the array received end up in a file named buggy.dat in the current directory. Since this is a raw dump of memory, you can look at it directly with a tool such as od(1). As received is an int array, you can check it like this:

$ od -td buggy.dat 
0000000 163 151 162 85
0000020 83 190 241 252
0000040 249 121
0000050
$ _
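
Because the dump is just the raw bytes of the array, it can also be read back from ordinary code, for example in a small test harness. The following is a minimal sketch under the assumption that the file is read on the same kind of machine that produced it (same int size and byte order); save_ints() and load_ints() are hypothetical helper names, not part of GDB.

```c
#include <stdio.h>

/* Writes n ints to path as raw bytes, the same layout a raw memory
 * dump of an int array has. Returns 0 on success, -1 on failure. */
int save_ints(const char *path, const int *in, int n)
{
    FILE *fp = fopen(path, "wb");
    if (fp == NULL)
        return -1;
    size_t written = fwrite(in, sizeof(int), (size_t)n, fp);
    if (fclose(fp) != 0)
        return -1;
    return written == (size_t)n ? 0 : -1;
}

/* Reads up to max ints from a raw dump file into out.
 * Returns the number of ints read, or -1 if the file cannot be opened. */
int load_ints(const char *path, int *out, int max)
{
    FILE *fp = fopen(path, "rb");
    if (fp == NULL)
        return -1;
    size_t got = fread(out, sizeof(int), (size_t)max, fp);
    fclose(fp);
    return (int)got;
}
```

With a 40-byte dump such as the one shown by od above, and assuming a 4-byte int, load_ints("buggy.dat", buf, 10) would be expected to fill buf with the ten saved values, which can then be passed to the processing code under test.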

If you manage to finish debugging right then, this feature isn't much use. But if you have to run the debugger again and again over several sessions, it turns out to be quite handy.

Say you have started a fresh GDB session.

    if (!packet_received) {

When execution reaches the code above, force packet_received to 1 so that the packet-receiving part is skipped. Changing the value of a variable is easy with the print command:

(gdb) p packet_received = 1

Then load the received array from the buggy.dat file saved earlier, as follows:

(gdb) restore buggy.dat binary received
Restoring binary file buggy.dat into memory (0xbfeda890 to 0xbfeda8b8)

Beyond these, GDB offers many other powerful features. We'll look at more of them another time.

