brianwawok · 10y ago · 0 points · 0 comments

How do you do that? I get that on bootup you could run a little ditty, but how would you know if random bits are getting flipped at runtime? Seems tricky for an embedded device...

9 comments · 2 top-level

gerbilly · 10y ago · 6 in thread

Not quite for memory _corruption_ but back when I was writing API code in C, I would place 'sentinels' at each end of my structs.

  struct somestruct {
    unsigned int s1;   /* leading sentinel; unsigned so 0xDEADBEEF fits */
    int data;
    char *moreData;
    unsigned int s2;   /* trailing sentinel */
  };

When the caller of the API needed to call my code, it first had to call a function to get an instance of the struct. This constructor-like code would allocate the memory for the struct and then set s1 and s2 to 0xDEADBEEF.

The user would then fill out the rest of the struct and pass it back in as an argument to another call.

If either s1 or s2 wasn't 0xDEADBEEF, I would throw an error to the caller.

It helped me catch a lot of cases where the caller of the API had overrun some string while filling out the inputs.
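A minimal sketch of the scheme described above, for readers who want to see it spelled out. The function names somestruct_new and somestruct_check are made up for illustration (the comment does not name them), and the sentinels are uint32_t here so that 0xDEADBEEF is representable.

```c
#include <stdint.h>
#include <stdlib.h>

#define SENTINEL 0xDEADBEEFu  /* guard value mentioned in the comment */

/* Field names follow the example above; the types are an assumption. */
struct somestruct {
    uint32_t s1;        /* leading sentinel */
    int      data;
    char    *moreData;
    uint32_t s2;        /* trailing sentinel */
};

/* Hypothetical constructor-like call: allocate, zero, arm both sentinels. */
struct somestruct *somestruct_new(void)
{
    struct somestruct *p = calloc(1, sizeof *p);
    if (p != NULL) {
        p->s1 = SENTINEL;
        p->s2 = SENTINEL;
    }
    return p;
}

/* Hypothetical entry check: 0 if both sentinels are intact, -1 otherwise. */
int somestruct_check(const struct somestruct *p)
{
    if (p == NULL || p->s1 != SENTINEL || p->s2 != SENTINEL)
        return -1;
    return 0;
}
```

Every API entry point would call somestruct_check first and return an error to the caller on -1, so a clobbered struct is reported before any pointer in it is dereferenced.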

Negitivefrags · 10y ago

This reminds me of something a friend of mine did once.

He had a structure that was getting overwritten with garbage due to an overrun somewhere else in the code. Rather than debugging and trying to find out what was doing it, he just put "char temp[1000];" at the top of the struct to "absorb the damage".

I believe it's still running like that in production to this day.

gerbilly · 10y ago

> Absorb the damage

That's funny.

The code above got written that way because at my first job, I inherited a godawful business charting API written by the lead developer.

The input to the API was a struct with 70-80 members that the caller had to fill in and there were no defaults for anything! Naturally there were not just scalars, but lots of arrays and strings in the struct, which could easily be overrun or often left null.

The users, quite understandably, didn't fill out everything, which led to frequent crashes in _my_ code because that's where the pointers would get dereferenced.

When they would see that the crash was not in their code, the users of the API would punt the error to me even though it was their bad input that caused the problem. This would happen 10-12 times a day.

I rewrote the entire thing in a paranoid style, employing the trick above and others to try to ensure that bad input would always crash on their side of the fence.

After I was done, I got one legitimate bug report for the code, even though it was in use worldwide in our medium-sized company.

brianwawok (OP) · 10y ago

This is kind of an extension of "throw more hardware at poor performance"... but it's throwing more bytes at bad code ;)

mayank · 10y ago

Neat trick. Add a 'crc' field after 's2' and you just made it work for memory corruption too.

chopin · 10y ago

This one is compelling.

However, this might not have caught the error condition described upthread. That condition might have overwritten data or moreData without touching s1 or s2.

Otherwise, great!

onetimePete · 10y ago

Reminds me of stack canaries: values you put at the foot of a stack and check from the scheduler.
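A rough sketch of that idea, assuming a downward-growing stack so that an overflow reaches the lowest-addressed words first. The struct task layout, the sizes, and the function names are all hypothetical, not taken from any particular RTOS.

```c
#include <stddef.h>
#include <stdint.h>

#define STACK_WORDS   256u
#define CANARY_WORDS  4u
#define CANARY_VALUE  0xDEADBEEFu

/* Hypothetical task control block holding only the task's stack. */
struct task {
    uint32_t stack[STACK_WORDS];
};

/* Write the canary pattern at the far (low) end of the stack. */
void task_stack_init(struct task *t)
{
    for (size_t i = 0; i < CANARY_WORDS; i++)
        t->stack[i] = CANARY_VALUE;
}

/* Meant to be called by the scheduler on a context switch:
 * returns 1 if the canary is intact, 0 if it has been overwritten. */
int task_stack_ok(const struct task *t)
{
    for (size_t i = 0; i < CANARY_WORDS; i++)
        if (t->stack[i] != CANARY_VALUE)
            return 0;
    return 1;
}
```

This only catches an overflow after the fact, and only if the overflow actually writes over the canary words rather than jumping past them.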

Also, to all those ready to do a checksum on a struct: remember that struct layout is platform dependent (padding bytes).
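One way around the padding problem, sketched under assumptions: hash each member separately instead of the raw bytes of the struct. The struct sample and both function names are invented for the example, and FNV-1a is used only because it is short; any checksum could be fed the same way.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical struct: on most ABIs there are padding bytes after `tag`. */
struct sample {
    char     tag;
    uint32_t value;
};

/* FNV-1a over a byte range, continuing from a running hash `h`. */
uint32_t fnv1a(uint32_t h, const void *buf, size_t len)
{
    const unsigned char *p = buf;
    for (size_t i = 0; i < len; i++) {
        h ^= p[i];
        h *= 16777619u;        /* 32-bit FNV prime */
    }
    return h;
}

/* Hash each member on its own so padding bytes never enter the result. */
uint32_t sample_checksum(const struct sample *s)
{
    uint32_t h = 2166136261u;  /* 32-bit FNV-1a offset basis */
    h = fnv1a(h, &s->tag, sizeof s->tag);
    h = fnv1a(h, &s->value, sizeof s->value);
    return h;
}
```

The result still depends on byte order and member widths, so it is comparable on one platform but not necessarily across platforms.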

agoetz · 10y ago · 1 in thread

SW Solution:

1. Store embedded system state in data structure.

2. Calculate a checksum for that data structure.

3. Verify that checksum is correct.
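The three steps above could look roughly like this. The struct sys_state, its fields, and the function names are hypothetical. The struct uses only uint32_t members so it is assumed to have no padding (the static assertion checks that), and the checksum is a plain bitwise CRC-32.

```c
#include <stddef.h>
#include <stdint.h>

/* Step 1: hypothetical system state kept in one structure. */
struct sys_state {
    uint32_t mode;
    uint32_t setpoint;
    uint32_t uptime_s;
    uint32_t crc;       /* CRC-32 of the three fields above */
};

_Static_assert(sizeof(struct sys_state) == 4 * sizeof(uint32_t),
               "unexpected padding in sys_state");

/* Bitwise CRC-32 (reflected polynomial 0xEDB88320); slow but table-free. */
uint32_t crc32_bytes(const void *buf, size_t len)
{
    const unsigned char *p = buf;
    uint32_t crc = 0xFFFFFFFFu;
    for (size_t i = 0; i < len; i++) {
        crc ^= p[i];
        for (int bit = 0; bit < 8; bit++)
            crc = (crc >> 1) ^ (0xEDB88320u & (0u - (crc & 1u)));
    }
    return ~crc;
}

/* Step 2: recompute the checksum after every legitimate update. */
void sys_state_seal(struct sys_state *s)
{
    s->crc = crc32_bytes(s, offsetof(struct sys_state, crc));
}

/* Step 3: returns 1 if the stored checksum still matches the contents. */
int sys_state_verify(const struct sys_state *s)
{
    return s->crc == crc32_bytes(s, offsetof(struct sys_state, crc));
}
```

A periodic task or the main loop would call sys_state_verify and fall back to a safe state on failure. How often to check is a trade-off against CPU time that the comment leaves open.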

HW Solution:

Lockstep Execution/ECC memory, etc.

julie1 · 10y ago

ECC + checksum have a slight flaw.

If too many errors happen, the checksum can still come out correct even though the content is corrupted.

Hum... I know what you think: ThisShouldNeverHappen

When exploited by a human it is called a collision attack. It works pretty well, because so many people trust but never check.
