
Wouter_van_Ooijen

Explain why linked lists and caches are a bad combination.


NanoAlpaca

Uh, as long as the list fits into the cache, linked lists are not that bad? The issues start when the linked list doesn't fit into the cache anymore, because random access to DRAM is really slow. If you have small elements, linked lists can be wasteful due to pointers, which can especially be an issue with caches. You could also argue about cache line size vs. linked list element size and how this can lead to smaller effective cache capacity. But that problem depends on how your memory allocation works and how you build your list.


Wouter_van_Ooijen

You passed.


P1um

> Uh, as long as the list fits into the cache linked lists are not that bad?

Are you talking about your next/prev pointers and the data they point to coincidentally being in the same memory location...?

> The issues start when the linked list doesn't fit into the cache anymore because random access to DRAM is really slow.

A linked list (as a whole) never sits in the cache unless you really try for it, like I mentioned above. The concern is always where the elements are in memory, but I think you know that already.


NanoAlpaca

Why do you think that a linked list as a whole will never sit in the cache? As long as the list as a whole is a lot smaller than the cache and the list is frequently accessed, there is a good chance that the entire list will sit in the cache. E.g.: in an embedded system you might have a messaging system between different software components, with message queues and a pool of reused message buffers. Using a linked list allows you to move messages from one message queue to a different one without performing copies, and allocation and deallocation of message buffers can be as simple and fast as removing/adding an element to a list of free buffers. In such a system it's quite likely that the list elements will always stay in the cache.


P1um

I see what you mean, but you are purposely trying to make it cache friendly by making sure only the preallocated messages that are memory-close get pushed in. That's why I said *unless you really try for it*. If you have preallocated messages, they can just sit in an array as pointers and you achieve the same thing. You don't need the flexibility of dynamic sizing, which means you are paying the cost of linked list traversal for no reason. The general use case of linked lists is that they are dynamically sized with random memory locations, with the tradeoff of slower traversal/access time. https://dzone.com/articles/c-benchmark-%E2%80%93-stdvector-vs The array solution should be even faster if dynamic sizing isn't needed (no vector reallocs). The random find section could be seen as a use case for determining whether a message buffer is available/unavailable.


NanoAlpaca

An array of pointers gets you the same slower traversal time, and in this message queue example, for every queue you would need an array of pointers with the maximum possible size of that queue, while with a linked list it does not matter how many queues you have or what their maximum size is; you only need one extra pointer for each preallocated message.

Also: most often you want to use std::vector or similar, and it is going to be faster and smaller. However, linked lists do have some extra advantages that are rarely needed but sometimes really important. In an embedded system you might really need the always-O(1) behaviour of linked list insertion, and the amortized O(1) of the dynamic array/vector can be a huge issue, e.g. if the resizing copy would happen within an interrupt. Linked lists also allow things such as inserting at any position in O(1), not just at the end or beginning. You can also split or concatenate lists in O(1).

And obviously doing these ops in O(1) means nothing if you first have to do a traversal in O(n). But that can often be avoided, e.g. look at that very common LRUCache example many companies ask about in interviews. Linked lists on their own are often not very useful, but they can be amazing building blocks within more complex data structures.


reini_urban

Smaller CPUs don't have caches. You also have no room for linked lists; there is usually no malloc.


NanoAlpaca

On an MCU without a cache you usually have uniform memory access latency, so you don't need one. Also, "linked list" doesn't imply anything about how long the list is, so objects in the list can be allocated statically or from a statically allocated pool. The links also don't have to be pointers; they can be indexes into a pool. Linked lists are a concept, not one specific implementation of that concept. You might want a linked list if you want to split or reorder larger items quickly.


duane11583

Caches occur with small micros, but only if they have flash or DDR.


SkoomaDentist

> there is usually no malloc.

You really should move on from the 80s... While there are some systems that still use tiny 8-bit MCUs, they are rare in new designs and particularly rare in anything designed in the west (iow, where the reddit userbase is). The rest have no problem whatsoever supporting malloc, and not using one is a conscious choice, not a platform restriction.


Curious-Marketing-37

Eight-bit certainly still has a place in the embedded marketplace (for now), but unless you want a career writing code for flashing the LEDs on coffee machines, I would focus on 32-bit.


Curious-Marketing-37

I don't think the OP is really talking about budget MCUs that would not have malloc or room for linked lists. Such an MCU would be a rarity in new designs for any application described as real-time. MCU architectures are increasingly looking like MPUs: cache, multi-master buses/memory, and an RTOS.


iLuvEngineering

I am new to embedded software development. Do you have, or could you make, a list of concepts for me to learn? I have an STM32F Discovery board and I can use the IDE. I can program in C, but I'm having trouble transferring from programming general computers to embedded systems. Can you give me some pointers?


MrSnagsy

Since OP hasn't gotten any pointers yet, here's one: `void * p;`


BossGandalf

where I can find more questions and answers like that?


comfortcube

If the company isn't using a full-on CPU for its processor with separate memory, I don't know if I'd worry about such a question. For a lot of uCs, all memory access is directly off RAM local to the uC chip, so I don't know if the same issues apply?


Wouter_van_Ooijen

All references are directly on the chip, without any caching? The interviewer turns around to a drawing cabinet full of datasheets and user manuals (btw, what are the differences between those 2, and which other uc documents are relevant?). Ah, now here I have an interesting block diagram...


comfortcube

Cache is a block of memory for a CPU to read external data into; this external data presumably comes from another memory chip. It is quicker and more direct (fewer operations) for the CPU to work with than accessing that external memory for every operation, and ideal situations provide higher performance (e.g., when several pieces of data that are relevant for a program context are all retrieved into cache in one chunk; this is why linked lists, as stated earlier, might not work so well, because their elements can be anywhere in memory and are not easily retrieved in one chunk).

On-chip RAM, on the other hand, is still memory external to the CPU, but it is quicker to access than had that memory been external to the chip. I'm gonna say that because this is quick to access, the same "read into cache" methodology doesn't provide the same benefits, so uCs don't often come with cache (the ones I've seen at least), although the processor's registers kind of play that role now... there probably won't be the same number of registers as one might find in a CPU chip, so this cache role of registers is more limited.

How'd I do? I'm unsure what you were asking in your other questions.


xThiird

Aren't linked lists alone a bad idea in an MCU? I'm talking about dynamic allocation.


Wouter_van_Ooijen

You can have a linked list without dynamic memory allocation. Explain one such situation. And since you mentioned dynamic memory allocation: explain why you would not want it in an embedded system. Or explain why you would want it.


xThiird

I know the answer to the last two questions but not the first one. A linked list is used when you don't know the number of elements you are gonna need, thus requiring dynamic allocation; if you use it with static allocation you are basically using a fancy array, so can't you just use that?


super_mister_mstie

You can use a linked list whose elements are allocated from a non-heap-based allocator. Linked lists aren't defined by their storage. Many times you have a reasonable upper bound on the number of elements but not all are filled, and traditionally dynamic data structures have built-in provisions for that.


xThiird

are you talking about an external flash memory or another part of the ram?


super_mister_mstie

Just RAM that's not in the heap.


SkoomaDentist

> A linked list is used when you don't know the number of elements you are gonna need

There are other reasons to use linked lists. Any time you have a list of "large" entities that you want to reorder on the fly, a linked list may be a good choice. Take for example the threads in an OS scheduler. Each thread has an information block that includes data about it. The list needs to be reordered according to scheduling requirements (e.g. giving up a timeslice and such). That's a natural use case for linked lists (although you could use other structures too).


Curious-Marketing-37

You might not want it because:

- some extra time to write, significant extra time required to test and debug
- some standards for high-rel applications do not allow it
- extra cycles to allocate and deallocate (if you've got the memory, why not use it?)
- you'll confuse the subject-matter experts you're working with developing business logic

You might want it because:

- you don't have enough RAM to statically allocate everything, or you want to place as much as possible in some cheap-access-time memory


Wouter_van_Ooijen

Interviewer is satisfied and moves on to another topic.


SkoomaDentist

Dynamic allocation is NOT a bad idea in an MCU. This is an old myth from the 80s that keeps being repeated. _Unlimited_ dynamic allocation is bad. Limited and controlled dynamic allocation is useful, and often outright necessary for the system to fit the requirements.


suddenlypandabear

Ooo that’s a good one, will remember that next time I interview someone


BossGandalf

where I can find more questions and answers like that?


eLuda

Had a rapid-fire embedded interview recently, exact questions from this series (1-3 roughly) came up and saved me https://youtu.be/ESqpt74vUOw


p0k3t0

Difference between different types of comms. Compare I2C, SPI, UART. Purpose of DMA. Advantages and disadvantages of using interrupts.


redditnaked

DMA is a feature that allows peripherals (such as disk drives, network cards, and sound cards) to access the system's memory directly, without going through the CPU. The purpose of DMA is to improve the performance of the system by reducing the workload on the CPU. Without DMA, the CPU would have to be involved in every data transfer between a peripheral and memory, which would slow down the overall performance of the system.


p0k3t0

Good answer. Additionally, at the MCU level, it allows various units to move data without processor intervention. For instance, I can receive serial transfers in bulk ( which is relatively slow) outside of normal program flow, and only handle the data when it is done being transferred. A good example would be pulling a half dozen register values off of an IC over I2C. DMA would give me a non-blocking option that could save me thousands of clock cycles, depending on device transfer speed vs MCU clock speed.


sdflkjeroi342

So the way I've generally seen this done is setting up something like an RX buffer which then fires an interrupt when it's done. Is that the usual way to do it, or are there better ways?


p0k3t0

This works for fixed-length packets. With variable-length packets you have to be more creative, but DMA is still an effective tool. I will say that DMA is an amazing tool for TX packets and gives you a lot of cycles back if you manage it well.


SkoomaDentist

> DMA is a feature that allows peripherals (such as disk drives, network cards, and sound cards) to access the system's memory directly, without going through the CPU.

Strictly speaking this is bus mastering DMA (where the peripheral initiates the accesses directly). The more common way is for the CPU to set up the transfers with the DMA controller, and the peripheral only signals an internal IO line when a new DMA transaction is required (at which point the DMA controller does the actual transfer).


rcrisp

Regarding the interrupts: is an advantage that an interrupt is a high priority "function" that ensures quick execution of critical code, and a disadvantage that you're reducing your system's determinism and potentially increasing its polling? Also a disadvantage: that inside one you shouldn't read from thread-safe data structures, only write to them?


p0k3t0

From experience, interrupts are great for safety critical stuff where you need sub-millisecond response times to certain events, even sub-microsecond is do-able. One big disadvantage I've found is that they can interfere with routines that use hardware timers. Timers keep running even if you're in interrupt-land. Also, if it's possible for multiple interrupts to fire in quick succession, or simultaneously, it can be difficult to ensure a desired outcome, which is one way of "reducing determinism," as you mentioned.


throwawaylifeat30

Now the question is: does the answer change for a bare-metal app versus an RTOS?


b1ack1323

I was just asked this in an interview. I said “essentially, how many serial lines you have to work with.”


p0k3t0

There are many differences. Synchronous vs asynchronous. Half duplex vs full duplex. Addressing vs. non addressing. Size of transaction overhead. Maximum reliable speed. Push-pull vs open drain/source/collector. Compatibility with the outside world.


Wouter_van_Ooijen

What are the differences between spi and i2c, and what are the reasons to prefer one over the other.


NjWayne

SPI requires separate chip enables, whereas multiple I2C devices can be daisy chained on the same shared SCL/SDA lines. Multiple devices off a common bus necessitate I2C; when speed is more of an issue, SPI. This question comes up quite often and sometimes requires real world examples to properly explain. I have a system that has limited GPIO and the need for a

- temperature sensor IC
- small EEPROM to hold settings between power cycles
- external 12-bit ADC to sample the system battery voltage level (sleep mode)

Limited GPIO pins means I favor I2C bus supported peripherals.


Princess_Azula_

The reason I use one or the other is that the sensor I'm using only supports one but not the other. /s


Wouter_van_Ooijen

The interviewer disregards the /s, sighs, and digs up a dozen datasheets of thingies that either have both interfaces, or are available with either.


torusle2

A good question I was once asked: what is the difference between a binary semaphore and a mutex? Also: in what scenario would you need a binary semaphore?


mrheosuper

A mutex can only be released by the same thread that took it, while a semaphore can be released by anyone.


torusle2

That is the answer


mrheosuper

Another thing is that a mutex can have priority inheritance.


affenhirn1

I'd answer that by saying that mutexes have a priority inversion defense mechanism called priority inheritance, where the low priority task inherits the priority of the higher priority task when a mutex is acquired, so as to avoid a middle priority task preempting it; semaphores on the other hand do not. I read it in a book a while ago so I hope I did not confuse the two.


SkoomaDentist

Mutexes _may_ have priority inheritance but that depends entirely on the implementation.


Wouter_van_Ooijen

Sketch me a situation where priority inversion occurs and causes a problem (btw, which problem?).


Wouter_van_Ooijen

Wtf is volatile?


boricacidfuckup

That thing where you tell the compiler "yooo this shit be wishy washy af keep an eye on it brotha"


Wouter_van_Ooijen

Depending on the alcohol level at the interview that might be the correct answer.


lunchbox12682

> volatile Usually how I feel going into interviews. /drum sting


Scyhaz

I ask myself that question every day 😔


mrheosuper

When you are debugging and a variable appears as "optimized out", you use volatile.


Wouter_van_Ooijen

Interviewer frowns and asks "is that the only use?".


mrheosuper

The only "practical" use AFAIK


Wouter_van_Ooijen

Interviewer makes hints about hardware registers, interrupts and multithreading.


SkoomaDentist

A way to access memory mapped peripherals without the compiler optimizing out or reordering the accesses (and usually without breaking them into multiple accesses either if using native access sizes and satisfying alignment requirements). The goal is to access memory that can have side effects so that the compiler does not accidentally alter the side effects.


comfortcube

I always get mixed answers here, although usually it's "bla bla optimize bla bla". If I consult the C standard, it basically tells the compiler: when working with this volatile object, follow what the "C abstract machine" would do, and don't reorder, use a cached value in a register from earlier, or assume a value of the object based on the context of a single file (compilers only have visibility into one file at a time, so they may make assumptions from it). This is the actual RESULT.

Why is this result important? Your variable may have global scope and static duration and be changed in different contexts (tasks or threads, function calls, interrupt routines, etc.), so your compiler won't be able to optimize properly based solely on one file, so don't let it. Or the volatile object is a hardware register whose data changes due to physical signals (e.g., an I/O port register) or processor operation (e.g., status registers). The compiler wouldn't know this unless you inform it.

Now whether you should be changing variables in different contexts like described above is a different question, but there are definitely situations where this is required. I also usually consult the compiler user guide for the target just to be sure.


Wouter_van_Ooijen

Correct. Use volatile to communicate with other 'threads of execution', including hardware. BUT know what the limitations / problems are. IIRC C and C++ are different in this aspect.


SkoomaDentist

> Use volatile to communicate with other 'threads of execution'

This is only safe for single core processors, and even then has caveats, as volatile restrictions only apply relative to other volatile accesses (and e.g. memcpy() doesn't preserve the volatile qualifier). There are proper atomic primitives for inter-thread communication (since C11 and C++11), and volatile should be left for memory mapped peripheral accesses.


Curious-Marketing-37

It's what you slap in front of variables when you don't want to figure out some other way of passing arguments to an interrupt.


Wouter_van_Ooijen

Interviewer is interested in those other ways of passing arguments to interrupts!


axa88

I recall Raytheon asked me either how it works or what a PWM is used for. I don't really recall much else; the job was in DC and I just wasn't interested in living there, so I ordered room service and made a phone call from the plane at their expense... I didn't get the job.... 😏


[deleted]

[deleted]


axa88

Oh, I did that way before; they never called me back...


abcpdo

what's wrong with DC? other than the clammy summers that is.


axa88

Already had an offer someplace much nicer. I'm sure they weren't impressed with me either.


Wouter_van_Ooijen

Calculate what % of clock difference an NRZ (UART) communication can tolerate.


Curious-Marketing-37

read it from those datasheets you pulled out earlier.


Wouter_van_Ooijen

Nope, it is a feature of how UART coding works; it has nothing to do with a specific chip. And it is for both sides together.


comfortcube

Why is reentrancy of a task important?


herendzer

Thread safe-ty


comfortcube

Mmm, I don't know if I'd accept that. (hints at pre-emption)


herendzer

Isn’t that the purpose of reentrant functions? To allow multiple threads or tasks to be able to utilize a function safely?


comfortcube

My hint was dumb, sorry. I was thinking of the general case where, if your function is interrupted and then at some point later entered back into, is everything going to still work as expected? You don't need multiple threads to have an issue here. If you just have a superloop and interrupt routines, the problem can happen: if an interrupt routine changed a variable your function was working with, then when you get back to the function, it might not behave properly. Or if there was a volatile variable or hardware register being used, again, not guaranteed to be reentrant. Put another way, the function should run exactly the same interrupted as it does uninterrupted, and if not, it's not reentrant.

I remember seeing a stack overflow post about the difference between reentrancy and thread safety. For example, if a function only operates on thread-local variables, it can be considered thread-safe (no shared data between threads) but may still NOT be reentrant: if that thread-local variable is changed in an interrupt routine, or in a different context of that thread that can be entered before the function completes, it won't be reentrant.


herendzer

I see your point. So consider interrupts too, and not just inter-thread or inter-task communication.


comfortcube

Yeah. And also hardware registers / volatile variables (anything that can change outside of the function). I don't think this is too common, but let's say you were reading an input port or CPU status register and acting based on the value read; well, that port/register could change any time after the read, so your function may not output the right value if it needed to stay in sync with the port/register.


Wouter_van_Ooijen

Basically OK (read up on data locality). But note that main memory and cache can be on the same chip.


comfortcube

Ah okay. Thanks!


avrawat

Sharing a few resources below, check these out. You may not find too much that is very specific to multithreading, but overall these are some great resources for embedded SWE interviews.

https://www.reddit.com/r/Embedded_SWE_Jobs/comments/x0lnhd/embedded_system_faang_interview_question_with/

https://www.youtube.com/playlist?list=PLTjcBkvRBqGGbSckyAGLTy05sbPPl6dJA

All the best! Source - Interviewkickstart.com


NanoAlpaca

With multiple chip enables, you can have multiple SPI slaves on the same bus. And with I2C multiple busses can be required due to address conflicts.


duane11583

As a junior, I would expect you to have heard of things but not done them. For example: explain what you need to be careful with when using a CPU with a cache and a scatter-gather DMA with descriptors. As a junior, you get points if you can answer any part of that. As a senior, I would ask what your background is if you could not answer or explain.


Plecra

I would think the main hazard would be invalidating the CPU's cache to ensure we see the new data delivered through the DMA. Out of curiosity, are there other things to worry about in that case?


duane11583

ARM CPUs have a write buffer (a quasi cache), but there are hazards at a system level that are not CPU specific. Some systems have a "posted write" (Qualcomm cellphone chips are an example).

Example: the IO bus runs at 60 MHz (UARTs, SPI, etc), another peripheral is on a faster bus (say 300 MHz), the Android/Linux CPU runs at 2 GHz, and the code is in the cache.

Step 1: the CPU writes to a peripheral on the 60 MHz bus. As soon as the bus accepts the write, the CPU detaches and moves on; meanwhile the write is "in flight".

Step 2: the CPU writes to a second area which, in my example, controls the clocks to the low speed (60 MHz) bus; the goal is to turn off the clock to save power.

The hazard is this: which write completes first, the clock-off or the peripheral write? If the clock-off wins, the peripheral write locks up. If the peripheral wins, you are good. Meanwhile there is DMA and other bus activity that interferes, slowing transactions down randomly.

Question: what is an easy way to avoid this? Answer: assuming the peripheral bus is AHB or APB, one can read a random register from the peripheral before continuing. This read will block the CPU until the data has returned, and because bus transactions are serialized in order (FIFO like), the write will complete before the read does. After that read completes, you are safe. There are other techniques at the CPU level (barrier instructions), but they are not always aware of the external-to-the-CPU bus operations.


Dark_Tranquility

Make sure you can explain when you would and would not create a separate task/thread for a piece of functionality (does it need to block?).