Interview Questions - Operating System

Deadlock
Mutual Exclusion
Hold & Wait
No Pre-emption

Circular Wait
- Protected access to shared resources, which implies waiting.
- No resource preemption, meaning that the system cannot forcibly take a resource from a thread holding it.
- Circular dependency graph

Starvation - a scheduler process(OS) refuses to give a particular thread any quantity of a particular resource

Swapping occurs when whole process is transferred to disk
Paging occurs when some part of process is transferred to disk while rest is still in physical memory
- page fault

Ephemeral ports are port used by Operating system for client sockets.

Zombie processes
a zombie process or defunct process is a process that has completed execution but still has an entry in the process table.
- child processes whose parent process died without reaping its children
ps aux | grep Z

kill  -17
Orphan Process
process whose parent process has finished or terminated, though it remains running itself
- adopted by init process
- intentionally orphaned - nohup
orphaned unintentionally, such as when the parent process terminates or crashes

/var/log/messages
cat /proc/cpuinfo
cat /proc/sys/kernel/threads-max

stack vs heap
- each thread has its own stack, they all share the heap
- size of stack is lot lesser than heap
- variables stored in stacks are only visible to the owner Thread, while objects created in heap are visible to all thread.
- when no memory in stack, it throws StackOverFlowError, when no memory in heap, it throws OutOfMemoryError

Semaphore has no concept of an owner. Any process can unlock a semaphore.

Mutexes have a concept of an owner, which is the process that locked the mutex. Only the process that locked the mutex can unlock it.
Memory Mapped File
- allows you to directly read from memory and write into memory by using direct and non direct Byte buffers.
- operating system takes care of reading and  writing and even if your program crashed just after writing into memory. OS will take care of writing content to File
- shared memory, memory mapped files can be accessed by more than one process and can be act as shared memory with extremely low latency.
- mapping whole file or portion of file into memory and operating system takes care of loading page requested and writing into file while application only deals with memory which results in very fast IO operations
- Memory used to load Memory mapped file is outside of Java heap Space
- Memory Mapped Files are way faster than standard file access via normal IO.
- allows you to load potentially larger file
- file can be shared
- may lose data if os crashes or power failure
disadvantages
- increase number of page faults. Since operating system only loads a portion of file into memory if a page requested is not present in memory than it would result in page fault.


a port is a software construct
- uniquely identify different applications
- so they can share a single physical connection
- only 65536 ports

/etc/services

Pipes are unidirectional byte streams which connect the standard output from one process into the standard input of another process

- implemented by PipeFS which is a unique virtual filesystem

Main IPC(Inter-process communication) methods
File, Signal, Socket, Pipe, Named pipe, Shared memory, Message queue, Message passing, Memory-mapped file, Semaphore [thread synchronization mechanism].
Signal is an asynchronous notification sent to a process in order to notify it of an event that occurred.
When a signal is sent to a process, the operating system interrupts the process's normal flow of execution. If the process has previously registered a signal handler, that routine is executed. Otherwise the default signal handler is executed.
In Linux/Unix, a set of processes chained by their standard streams, so that the output of each process (stdout) feeds directly as input (stdin) to the next one. Each connection is implemented by an anonymous pipe.
a named pipe (also known as FIFO) is an extension to the traditional pipe.
A traditional pipe is "unnamed" because it exists anonymously and persists only for as long as the process is running. A named pipe is system-persistent and exists beyond the life of the process and must be deleted once it is no longer being used. Processes generally attach to the named pipe (usually appearing as a file) to perform inter-process communication (IPC).
mkfifo my_pipe
gzip -9 -c < my_pipe > out.gz
cat file > my_pipe
rm my_pipe
Shared memory is memory that may be simultaneously accessed by multiple programs with an intent to provide communication among them or avoid redundant copies.
Since both processes can access the shared memory area like regular working memory, this is a very fast way of communication. On the other hand, it is less powerful, as for example the communicating processes must be running on the same machine, and care must be taken to avoid issues if processes sharing memory are running on separate CPUs and the underlying architecture is not cache coherent.
Message queues provide an asynchronous communications protocol, meaning that the sender and receiver of the message do not need to interact with the message queue at the same time. Messages placed onto the queue are stored until the recipient retrieves them.
Implementations include IBM's WebSphere MQ, Apache ActiveMQ and etc.
In this model, processesto can send and receive messages to other processes. By waiting for messages, processes can also synchronize.
Common message passing methods contain SOAP, Java RMI, Corba and etc.
A memory-mapped file is a segment of virtual memory which has been assigned a direct byte-for-byte correlation with some portion of a file or file-like resource.
Synchronization Mechanism
Common OS synchronization mechanisms include mutexes, semaphores, and monitors, readers/writer locks.
Mutual Exclusion (Mutex) Locks
Mutex semantics are "bracketed", the thread owning a mutex is responsible for releasing it.
There are two general types of mutexes:
- Nonrecursive mutex that will deadlock or otherwise fail if the thread currently owning the mutex tries to reacquire it without first releasing it and
- Recursive mutex that will allow the thread owning the mutex to acquire it multiple times without self-deadlock, as long as the thread ultimately releases the mutex the same number of times.
Semaphore Locks
A semaphore is conceptually a non-negative integer that can be incremented and decremented atomically. A thread blocks when it tries to decrement a semaphore whose value equals 0. A blocked thread will be released only after another thread "posts" the semaphore, causing its count to become greater than 0.
On semaphore, we can have two operations, denoted as V or also signal() that  increments the semaphore, and  P (or wait()) that decrements the semaphore.
Unlike mutexes, they need not be released by the same thread that acquired them initially.
Semaphores which allow an arbitrary resource count are called counting semaphores, whilst semaphores which are restricted to the values 0 and 1 (or locked/unlocked, unavailable/available) are called binary semaphores.
Monitor
A monitor is an object or module intended to be used safely by more than one thread.
The defining characteristic of a monitor is that its methods are executed with mutual exclusion. That is, at each point in time, at most one thread may be executing any of its methods.
A condition variable is a queue of threads, associated with a monitor, on which a thread may wait for some condition to become true. Thus each condition variable c is associated with an assertion Pc. While a thread is waiting on a condition variable, that thread is not considered to occupy the monitor, and so other threads may enter the monitor to change the monitor's state. In most types of monitors, these other threads may signal the condition variable c to indicate that assertion Pc is true in the current state.
It provides better performance than a mutex, readers/writer, or semaphore lock. As the latter three mechanisms make other threads wait while the thread holding the lock executes code in a critical section.
Readers/Writer Locks
A readers/writer lock allows access to a shared resource by either:
- Multiple threads reading a resource concurrently without modifying it or
- Only one thread at a time modifying the resource and excluding all other threads from both read and write access.
This type of lock can improve the performance of concurrent applications in which resources protected by a readers/writer lock are read from much more frequently than they are written to. Readers/writer locks can be implemented to give preference either to readers or to writers
The major difference between threads and processes includes:
1. A process is an execution of a program but a thread is a single execution sequence within the process. A process can contain multiple threads. A thread is sometimes called a lightweight process.
2. Process has its own memory space, runtime enivorment and process Id, Thread runs inside a Process and shares its resources with other threads.
3. Threads have direct access to the data segment of its process; processes have their own copy of the data segment of the parent process.
4. Threads can directly communicate with other threads of its process; processes must use interprocess communication to communicate with sibling processes.
5. Threads have almost no overhead; processes have considerable overhead.
6. New threads are easily created; new processes require duplication of the parent process.
7. Threads can exercise considerable control over threads of the same process; processes can only exercise control over child processes.
8.Changes to the main thread (cancellation, priority change, etc.) may affect the behavior of the other threads of the process; changes to the parent process does not affect child processes.
ps:
A JVM runs in a single process, threads in a JVM share the heap belonging to that process. That is why several threads may access the same object. Threads share the heap and have their own stack space. This is how one thread’s invocation of a method and its local variables are kept thread safe from other threads. But the heap is not thread-safe and must be synchronized for thread safety.
An inode is a data structure on a Linux/Unix file system for any regular file, directory, or other file system object.
It stores meta-data about file such as file type, permissions, its owner, group, file Size, number of hard links, access control list and etc.
It does not contain file names, only file metadata.
UNIX directories are lists of association structures, each of which contains one filename and one inode number.

We can use ls -i or stat command to see inode number of file.
ls -i /etc/passwd, stat /etc/passwd
If you creates filename with control characters or characters which are unable to be input on a keyboard or special character such as ?, * ^ etc, how to remove it?
1. Find out file inode by ls -i or stat command with wildcard.
2. Use find command with -inum [inode-number] and "-exec rm" to remove file:
find . -inum [inode-number] -exec rm -i {} \;
find . -inum [inode-number] | xargs rm
Maximum number of processes in Linux?
Linux sets a limit on the total number of processes on the system, we can view this vlaue from /proc/sys/kernel/threads-max, or write a number to chnage the value.
In my ubuntu machine:
cat /proc/sys/kernel/threads-max
32619
Maximum number of threads per process in Linux?
Linux doesn't have a separate threads-per-process limit, just a limit on the total number of processes on the system.
Threads are essentially just processes with a shared address space on Linux.
/usr/include/bits/local_lim.h
Hard vs Soft Links
From implementation perspective:
Hard links
A hard link is a pointer to the file's i-node.
Soft links (symbolic links):
A symbolic link is a special type of file that contains a reference to another file or directory in the form of an absolute or relative path and that affects pathname resolution.
It merely contains an absolute or relative path to another file or directory.
It is a file on its own and can exist independently of its target.
Difference between Hard link and Soft link
1. Normally, hard links can't link to directories -- this is to avoid endless recursion and confusion. In stead, soft links can link to directories.
2. Hard links can't cross file system boundaries but Soft links can cross file system boundaries - Explain this based on how they are implemented.
3. After original file is moved to another place or deleted, hard links still refer to the original file, soft links are invalidated.
4. A file can have multiple hard links, after all hard links are deleted, the file content is removed.

A single inode number use to represent file in each file system. Hard links are based upon inode number, all hard links have same inode number as original file.
If we can create a hard link between different file systems, the file system would be unable to determine whether this inode refers to its filesystem or another filesystem.
Each file system has a superblock, which contains metadata about file system such as: file system type, size, status, the inode counts, block counts, free blocks, date of the last file system check and etc.
A superblock is located at position 0 of every partition.

As this information is very important, if it's lost, data would be lost, so Linux maintains multiple redundant copies of the superblock in every file system. In emergency case, we can use backup copies to restore damaged primary super block.
How to Restore a Bad Superblock?
There are several backups superblock stored on the harddisk in well-known place.
We can use fsck or # e2fsck -b 32768 /dev/sda5 to restore from backup superblocks.
We can use "dumpe2fs /dev/sda5 | grep -i superblock" command to find primary and backup superblocks.

Resources
Post a Comment

Labels

Java (159) Lucene-Solr (112) Interview (61) All (58) J2SE (53) Algorithm (45) Soft Skills (38) Eclipse (33) Code Example (31) Linux (25) JavaScript (23) Spring (22) Windows (22) Web Development (20) Tools (19) Nutch2 (18) Bugs (17) Debug (16) Defects (14) Text Mining (14) J2EE (13) Network (13) Troubleshooting (13) PowerShell (11) Chrome (9) Design (9) How to (9) Learning code (9) Performance (9) Problem Solving (9) UIMA (9) html (9) Http Client (8) Maven (8) Security (8) bat (8) blogger (8) Big Data (7) Continuous Integration (7) Google (7) Guava (7) JSON (7) Shell (7) ANT (6) Coding Skills (6) Database (6) Lesson Learned (6) Programmer Skills (6) Scala (6) Tips (6) css (6) Algorithm Series (5) Cache (5) Dynamic Languages (5) IDE (5) System Design (5) adsense (5) xml (5) AIX (4) Code Quality (4) GAE (4) Git (4) Good Programming Practices (4) Jackson (4) Memory Usage (4) Miscs (4) OpenNLP (4) Project Managment (4) Spark (4) Testing (4) ads (4) regular-expression (4) Android (3) Apache Spark (3) Become a Better You (3) Concurrency (3) Eclipse RCP (3) English (3) Happy Hacking (3) IBM (3) J2SE Knowledge Series (3) JAX-RS (3) Jetty (3) Restful Web Service (3) Script (3) regex (3) seo (3) .Net (2) Android Studio (2) Apache (2) Apache Procrun (2) Architecture (2) Batch (2) Bit Operation (2) Build (2) Building Scalable Web Sites (2) C# (2) C/C++ (2) CSV (2) Career (2) Cassandra (2) Distributed (2) Fiddler (2) Firefox (2) Google Drive (2) Gson (2) How to Interview (2) Html Parser (2) Http (2) Image Tools (2) JQuery (2) Jersey (2) LDAP (2) Life (2) Logging (2) Python (2) Software Issues (2) Storage (2) Text Search (2) xml parser (2) AOP (1) Application Design (1) AspectJ (1) Chrome DevTools (1) Cloud (1) Codility (1) Data Mining (1) Data Structure (1) ExceptionUtils (1) Exif (1) Feature Request (1) FindBugs (1) Greasemonkey (1) HTML5 (1) Httpd (1) I18N (1) IBM Java Thread Dump Analyzer (1) JDK Source Code (1) JDK8 (1) JMX (1) Lazy Developer (1) Mac (1) Machine Learning (1) Mobile (1) My Plan for 2010 (1) Netbeans (1) Notes (1) Operating System (1) Perl (1) Problems (1) Product Architecture (1) Programming Life (1) Quality (1) Redhat (1) Redis (1) Review (1) RxJava (1) Solutions logs (1) Team Management (1) Thread Dump Analyzer (1) Visualization (1) boilerpipe (1) htm (1) ongoing (1) procrun (1) rss (1)

Popular Posts