Vida CEM swapping

Post by **vtl** » 18 Feb 2022, 08:52

Sh4rp wrote: ↑18 Feb 2022, 08:33 With the code change sirloins proposed I was able to crack my 8690720 CEM.

I have CEM MCU frequency alignment code in my private branch; it increases sensitivity even for L-shaped P2. I'll polish the code and post it later for testing.

Unfortunately, brick-shaped P2 is still not cracking.

Post by **sirloins** » 18 Feb 2022, 09:04

So my updated change is just changing this line:

limit = TSC + 2 * 1000 * clockCyclesPerMicrosecond();

to this line:

limit = TSC + 2 * 1000 * clockCyclesPerMicrosecond() / 4;

Here is the resulting difference in the dumped buckets when trying just PIN 05 (all others are similar):

Original Code:

[ 05 -- -- -- -- -- ]:     1   442     0   175     0    22     0     2     0     0     0     0     0     0     0     0     0     0     0     0 : latency     139384; std 40.36
Average latency: 147
  73 :     2
 108 :     1
 126 :     1
 132 :     1
 134 :     8
 136 :    69
 137 :     1
 138 :   275
 139 :     1
 140 :   442
 142 :   175
 144 :    22
 146 :     2

With my Change:

 [ 05 -- -- -- -- -- ]:    37     0    31     0    31     0    38     0    26     0    29     0    31     0    32     0    33     0    35     0 : latency     140313; std 12.29
Average latency: 138
 106 :    17
 108 :    31
 110 :    26
 111 :     1
 112 :    28
 114 :    23
 116 :     5
 118 :    37
 120 :    33
 122 :    35
 124 :    17
 126 :    35
 128 :    29
 130 :    37
 132 :    31
 134 :    31
 136 :    38
 138 :    26
 140 :    29
 142 :    31
 144 :    32
 146 :    33
 148 :    35
 150 :    19
 152 :    28
 154 :    16
 156 :    51
 158 :    37
 160 :    34
 162 :    29
 164 :    27
 166 :    26
 168 :    23
 170 :    23
 172 :    29
 174 :    18

So reducing the limit by 4x from 2ms to 0.5ms gives more variation in the returned latencies, and also some higher ones? The only thing I can think of is that without my change we hit the CAN Interrupt before the limit, and with my change, we hit the limit first? What do you guys think?

Edit: I confirmed that with the default code, it hits the CAN interrupt first before the TSC limit, every time. With my change, it is hitting the TSC limit every time.

Full logs where I took the above data from are attached.
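To make the two timeouts concrete, here is a small sketch of the arithmetic. It assumes `clockCyclesPerMicrosecond()` returns 600 (a 600 MHz core clock, which is an assumption and not stated in this thread) and uses a made-up reply time of 1 ms. `timeoutCycles` and `firstExit` are illustrative names, not functions from the cracker.

```cpp
#include <assert.h>
#include <stdint.h>

// Which condition ends the sampling loop first.
enum class LoopExit { CanInterrupt, TscLimit };

// Same expression shape as in the post: 2 * 1000 * cyclesPerUs / divisor.
// divisor = 1 is the stock 2 ms window, divisor = 4 is the 0.5 ms window.
uint32_t timeoutCycles(uint32_t cyclesPerUs, uint32_t divisor) {
    assert(divisor != 0);
    return 2u * 1000u * cyclesPerUs / divisor;
}

// If the reply interrupt arrives before the window closes, the interrupt
// ends the loop; otherwise the TSC limit does.
LoopExit firstExit(uint32_t replyCycles, uint32_t limitCycles) {
    return replyCycles < limitCycles ? LoopExit::CanInterrupt
                                     : LoopExit::TscLimit;
}

// ASSUMPTION: 600 cycles per microsecond, hypothetical reply after 1 ms.
static const uint32_t kCyclesPerUs = 600;
static const uint32_t kReplyCycles = 1000u * kCyclesPerUs;
```

Under these assumptions the stock window is 1,200,000 cycles and the reduced one is 300,000 cycles, so a reply arriving after 1 ms ends the loop via the interrupt in the first case and via the limit in the second.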

Post by **vtl** » 18 Feb 2022, 09:50

sirloins wrote: ↑18 Feb 2022, 09:04 So reducing the limit by 4x from 2ms to 0.5ms gives more variation in the returned latencies, and also some higher ones? The only thing I can think of is that without my change we hit the CAN Interrupt before the limit, and with my change, we hit the limit first? What do you guys think?

Edit: I confirmed that with the default code, it hits the CAN interrupt first before the TSC limit, every time. With my change, it is hitting the TSC limit every time.

Full logs where I took the above data from are attached.

Yes, the limit is the total timeout for CAN reply, while the body of the function measures the longest pause on CAN bus. Because, unlike the synchronous or almost synchronous MCP2515 design, the Teensy code is not in control of when the CAN request will be sent out, reducing the limit to 500 us (which is more than enough for all known CEMs to reply) may/will affect the measurement, since a good chunk of time will be spent waiting for TX to complete.
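As a rough illustration of "measuring the longest pause", the sketch below scans a recorded trace of pin samples and returns the longest stretch during which the line stayed high. It is not the cracker's actual loop; `Sample`, `longestHighRun` and the example trace are hypothetical.

```cpp
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

// One reading of the monitored pin: cycle-counter timestamp and line level.
struct Sample {
    uint32_t tsc;
    bool high;
};

// Returns the longest time (in cycles) the line stayed high without
// interruption. Unsigned subtraction keeps the difference correct across
// a single counter wrap-around.
uint32_t longestHighRun(const Sample* samples, size_t n) {
    assert(samples != nullptr || n == 0);
    uint32_t longest = 0;
    uint32_t runStart = 0;
    bool inRun = false;
    for (size_t i = 0; i < n; ++i) {
        if (samples[i].high) {
            if (!inRun) {
                inRun = true;
                runStart = samples[i].tsc;
            }
            uint32_t len = samples[i].tsc - runStart;
            if (len > longest) {
                longest = len;
            }
        } else {
            inRun = false;  // a low level ends the current pause
        }
    }
    return longest;
}

// Made-up trace: a short pause (0..10) and a longer one (30..70).
static const Sample kExampleTrace[] = {
    {0, true},  {10, true}, {20, false}, {30, true},
    {40, true}, {70, true}, {80, false},
};
static const size_t kExampleTraceLen =
    sizeof(kExampleTrace) / sizeof(kExampleTrace[0]);
```

For the example trace the longest pause is the 40-cycle stretch between timestamps 30 and 70.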

Actually, we've introduced one more undesired "bug" in the schematic: pin 2 is now attached to the RX lane of the CAN transceiver, which might not see anything on the bus while TX is active. The original design saw both egress and ingress CAN frames. The fear was that the Teensy is only 3.3 V tolerant, while CAN voltages can be higher. But in real life Volvo has only 3.3 V CAN buses, so that design change might have a negative impact on measurement. Or not.

Out of curiosity, I connected pin 2 to CAN-HS L line and it still worked with P2 CEM-L.

Post by **vtl** » 18 Feb 2022, 09:52

Everyone is welcome to test the new code: https://github.com/vtl/volvo-cem-cracke ... freq_align

It is based on sirloins' discovery that the crack chance is higher with the latency resolution aligned to the CEM MCU's clock frequency.

Post by **sirloins** » 18 Feb 2022, 10:52

I pulled the changes, but it did not work. I will run it with more logging but I am still trying to better understand why my change works. If I take your change, and just add the "limit = TSC + 2 * 1000 * clockCyclesPerMicrosecond() / 4;" then it works...

As I mentioned, with my change we exit that while loop in cemUnlock because TSC > limit, not because of the CAN interrupt. After the loop it then calls canMsgReceive with a 1000 ms timeout. Since the interrupt hasn't triggered, there are no messages available and it hits the delay(1) within canMsgReceive (1 ms delay). This is also why my change makes the pin/sec rate much slower than normal.

Okay, so I thought: to speed things up, why not just wait for the CAN interrupt after we break out of the while loop in cemUnlock? That way we avoid the 1 ms delay. Well, then my code also doesn't work; latencies look almost the same as when I did not have the TSC limit reduced.

So why does that 1 ms delay matter? Perhaps it has to do with the CAN controller sending messages asynchronously, as you mentioned, but if I take the stock code and just add a 1 ms delay before the canMsgReceive call it still doesn't work. That leads me to believe there is something going on with the CAN_L_PIN before we get the CAN interrupt. But that doesn't explain why it still doesn't work when I wait for the CAN interrupt _after_ we have already finalized maxTime (latency). Confusing.

So breaking out of the while loop before the CAN interrupt seems to be important, but also having a little bit of a delay between PIN checks...

Attached are partial logs of your branch, with and without the divide by 4.

Post by **vtl** » 18 Feb 2022, 11:00

Can you try rerouting pin 2 to CAN differential L lane? From dashed to red.

Post by **sirloins** » 18 Feb 2022, 11:12

Yes, I can try that change. Will take a few minutes as I soldered the board together in order to test in my car last night. I'll add jumpers to pick CANL or CRX.

Just to make it a bit clearer, I am confused as to why this does not work:

limit = TSC + 2 * 1000 * clockCyclesPerMicrosecond() / 4; // TSC Limit now 0.5ms
...
while (!intr && TSC < limit) .....
...
// We exit out of the while loop due to TSC now > limit

// Wait for CAN interrupt so we don't hit the delay(1) in canMsgReceive()
do{}while(!intr);

/* see if anything came back from the CEM */
canMsgReceive(CAN_HS, &id, reply, 1000, verbose);

// Add 1ms delay to simulate the canMsgReceive having to wait
delay(1);

It doesn't make sense to me. If I remove the "do{}while(!intr);" and the "delay(1)" it works, but not if I wait for the interrupt. maxTime should no longer be affected once we are out of that while loop.

Post by **vtl** » 18 Feb 2022, 11:14

sirloins wrote: ↑18 Feb 2022, 10:52 I pulled the changes, but it did not work. I will run it with more logging but I am still trying to better understand why my change works. If I take your change, and just add the "limit = TSC + 2 * 1000 * clockCyclesPerMicrosecond() / 4;" then it works...

As I mentioned, with my change we exit that while loop in cemUnlock because TSC > limit, not because of the CAN interrupt. After the loop it then calls canMsgReceive with a 1000 ms timeout. Since the interrupt hasn't triggered, there are no messages available and it hits the delay(1) within canMsgReceive (1 ms delay). This is also why my change makes the pin/sec rate much slower than normal.

Try this?

diff --git a/volvo-cem-cracker.ino b/volvo-cem-cracker.ino
index d0e9129..25b182c 100644
--- a/volvo-cem-cracker.ino
+++ b/volvo-cem-cracker.ino
@@ -368,6 +368,8 @@ bool cemUnlock (uint8_t *pin, uint8_t *pinUsed, uint32_t *latency, bool verbose)
 
   /* maximum time to collect our samples */
 
+  delay(1);
+
   limit = TSC + 2 * 1000 * clockCyclesPerMicrosecond();
   intr = false;

I've got a somewhat similar report for earlier P3 CEMs: punching them at full speed eventually "kills" them. Adding a delay of 2 ms or so between crack attempts solves the issue.

Looks like some CEMs need a quiet time for background work.

Post by **vtl** » 18 Feb 2022, 12:05

sirloins wrote: ↑18 Feb 2022, 10:52 Confusing.

What I did during my initial investigation over a year ago was toggle an IO pin whenever something important happened, such as a timeout or the detection of a longer reply latency, and capture that pin with a logic analyzer along with the CAN lanes. I have a cheap $10 knock-off of a Saleae Logic. Then it was clear what was happening on the bus and how the software reacted to the bus events.
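A minimal sketch of that kind of marker, assuming nothing about the actual debug code used back then: `DebugMarker`, `PinWriter` and `recordWrite` are hypothetical names. The pin access is passed in as a function so the same class builds on a PC, with a recording stub standing in for the real pin.

```cpp
#include <assert.h>

// Whatever actually drives the pin. On a Teensy this could wrap
// digitalWrite(); here it is abstract so the sketch also builds on a PC.
typedef void (*PinWriter)(bool level);

class DebugMarker {
public:
    explicit DebugMarker(PinWriter write) : write_(write), level_(false) {
        assert(write_ != nullptr);
        write_(level_);  // start from a known low level
    }

    // Flip the pin so the event shows up as an edge on the logic analyzer.
    void mark() {
        level_ = !level_;
        write_(level_);
    }

    bool level() const { return level_; }

private:
    PinWriter write_;
    bool level_;
};

// Host-side stand-in for a real pin: remembers what was written.
static int g_writes = 0;
static bool g_lastLevel = false;

void recordWrite(bool level) {
    ++g_writes;
    g_lastLevel = level;
}
```

On hardware, the writer would be a small function calling `digitalWrite(DEBUG_PIN, level)`, where `DEBUG_PIN` is a hypothetical spare pin wired to the analyzer, and `mark()` would be called at the timeout and at the point where a longer latency is detected.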

Post by **sirloins** » 18 Feb 2022, 13:15

I do have something like that I can use.

I tried the delayMicroseconds, no luck.

Another piece of information, disabling interrupts on your branch makes it work:

@@ -375,6 +375,7 @@ bool cemUnlock (uint8_t *pin, uint8_t *pinUsed, uint32_t *latency, bool verbose)
   canMsgSend (CAN_HS, 0xffffe, unlockMsg, verbose);
 
   start = end = TSC;
+  cli();
   while (!intr && TSC < limit) {
     /* if the line is high, the CAN bus is either idle or transmitting a bit */
 
@@ -393,6 +394,7 @@ bool cemUnlock (uint8_t *pin, uint8_t *pinUsed, uint32_t *latency, bool verbose)
 
     start = end;
   }
+  sei();

It does slow down the process a bit (no more intr to break the while loop), but it reliably found the first byte after the first pass. I think the only things triggering interrupts would be the two CAN controllers?

Also, the latency returned is much more varied, just like when I limited the TSC timeout.
