Sunday, March 1, 2009
CH-8630 Rüti, Switzerland
ste.forster at gmail.com
1. Threads and Programming Languages
Programming languages have come a long way in making programming safer and applications more stable.
C# in particular was designed for this purpose. Well-chosen language concepts allow the compiler to find many errors in advance while still compiling quickly and producing high-performance code.
But the concept of threads has not been incorporated comparably well.
Threads are multiple parallel execution paths through the code. Each thread may pass through many assemblies, namespaces, classes, and public and private methods. There are no higher-level concepts about threads. It is up to the programmer to know which method calls are allowed and which are forbidden for each thread.
Thread programming errors may lead to unpredictable results such as corrupted data, possibly followed by fatal application exceptions, or threads that stay blocked forever (race conditions and deadlocks).
Thread programming errors are especially difficult to find, yet easy for inexperienced programmers to introduce.
2. Threadsafe Architectures
There are many concepts and ways to address the problem of thread safety. I mention three of them here that are available to C# programmers.
2.1 Critical Sections
Critical sections are code sections that allow only one thread per synchronization object to enter. Other threads using the same synchronization object are blocked until the thread that entered has left the critical section.
Synchronization objects may be chosen to protect a single data member, all data of an object or all objects of a class or a set of classes.
Programmers may use the lock statement to establish critical sections inside their methods.
Advantages:
Critical sections protect object member data from nearly simultaneous, conflicting modifications by several threads operating on the same data.
Disadvantages:
It is difficult for a programmer to find all places where a critical section is needed.
Multiple critical sections may lead to deadlocks, a situation that is very difficult to foresee.
Critical sections incur a performance penalty if entered very often.
Guidelines from experience:
Use critical sections as little as possible and only in conceptually clear situations.
Be very careful about which synchronization object to reference.
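A minimal sketch of such a critical section, assuming a hypothetical Counter class (the class and member names are illustrative and not part of the original proposal). It uses a private synchronization object so that no outside code can lock on it, which is one way to follow the second guideline:

```csharp
using System;
using System.Threading;

// Hypothetical example type; the names are assumptions for illustration.
public sealed class Counter
{
    // Private synchronization object: only this class can lock on it.
    private readonly object _sync = new object();
    private int _value;

    public void Increment()
    {
        lock (_sync)                 // critical section: one thread at a time
        {
            _value = _value + 1;     // read-modify-write is not atomic by itself
        }
    }

    public int Value
    {
        get { lock (_sync) { return _value; } }
    }
}

public static class CriticalSectionDemo
{
    // Starts several threads that all increment the same counter
    // and returns the final value after all threads have finished.
    public static int Run(int threadCount, int incrementsPerThread)
    {
        var counter = new Counter();
        var threads = new Thread[threadCount];
        for (int i = 0; i < threadCount; i++)
        {
            threads[i] = new Thread(() =>
            {
                for (int n = 0; n < incrementsPerThread; n++)
                    counter.Increment();
            });
            threads[i].Start();
        }
        foreach (var t in threads) t.Join();
        return counter.Value;
    }
}
```

With the lock in place, CriticalSectionDemo.Run(4, 10000) returns 40000. Without it, some increments could be lost.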
2.2 Threadbound Objects
The situation gets much clearer when only one thread may operate on a set of objects.
These objects and their thread are bound together like a single-threaded sub-application.
Several of these sub-applications may operate in parallel. They exchange messages through queues instead of directly calling methods.
Microsoft .NET WinForms uses this concept and checks for foreign threads trying to access data bound to its own thread.
Inter thread communication:
To send data from an object X bound to thread A to an object Y bound to another thread B, one has to write the data into a queue B that is read by thread B.
To call a method of object Y, which belongs to thread B, one has to enqueue the method (as a delegate) and its parameters into queue B. Such methods cannot return a value directly. Return values are sent asynchronously by a callback method from thread B through queue A. Interface methods that send delegates to queue B may be grouped in an interface class If-B.
To call a method and wait for a synchronous return value, one has to do the same as above and additionally implement a blocking wait until thread B has produced the return value.
Asynchronous responses should be favoured over synchronous calls: there is no opportunity for deadlocks, the user interface is never blocked, and threads running in parallel can process more on a multi-CPU machine.
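The three kinds of calls described above can be sketched as follows. This is only an illustration under assumptions: QueueThread, Post and Send are hypothetical names, and the sketch relies on BlockingCollection and TaskCompletionSource from .NET 4, which is not something the original text prescribes.

```csharp
using System;
using System.Collections.Concurrent;
using System.Threading;
using System.Threading.Tasks;

// Hypothetical worker that owns one thread and one input queue.
public sealed class QueueThread : IDisposable
{
    private readonly BlockingCollection<Action> _queue = new BlockingCollection<Action>();
    private readonly Thread _thread;

    public QueueThread(string name)
    {
        _thread = new Thread(Loop) { Name = name, IsBackground = true };
        _thread.Start();
    }

    // Asynchronous call: enqueue the delegate and return immediately.
    public void Post(Action work) { _queue.Add(work); }

    // Synchronous call: enqueue, then block until the worker has produced the result.
    // Calling this from the worker thread itself would deadlock.
    public T Send<T>(Func<T> work)
    {
        var reply = new TaskCompletionSource<T>();
        _queue.Add(() =>
        {
            try { reply.SetResult(work()); }
            catch (Exception ex) { reply.SetException(ex); }
        });
        return reply.Task.Result;
    }

    // Runs on the owned thread: executes queued delegates one after another.
    private void Loop()
    {
        foreach (var work in _queue.GetConsumingEnumerable())
            work();
    }

    public void Dispose()
    {
        _queue.CompleteAdding();   // lets Loop finish after draining the queue
        _thread.Join();
        _queue.Dispose();
    }
}

public static class MessagingDemo
{
    // Thread A asks thread B for a value; B answers through queue A.
    public static int Run()
    {
        using (var finished = new ManualResetEvent(false))
        using (var a = new QueueThread("A"))
        using (var b = new QueueThread("B"))
        {
            int received = 0;
            a.Post(() =>                       // runs on thread A
                b.Post(() =>                   // runs on thread B
                {
                    int answer = 6 * 7;
                    a.Post(() =>               // callback, runs on thread A again
                    {
                        received = answer;
                        finished.Set();
                    });
                }));
            finished.WaitOne();
            return received;
        }
    }
}
```

In this sketch neither thread ever waits for the other in the asynchronous path, which is why that path cannot deadlock. Only Send blocks its caller.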
Critical sections as in concept 2.1 must not be used to protect data accessed by a public method.
This is because all other methods accessing the same data would need critical sections too, which would create opportunities for deadlocks.
Advantages:
The architecture leads to clean modules that operate in parallel.
Deadlocks and data corruption can be avoided by design when only asynchronous method calls are allowed.
Disadvantages:
The architecture must be enforced from the beginning of a project.
Interface classes and queues must be programmed manually.
It is still possible to call methods directly when references to the other thread's objects are passed.
The proposal in chapter 3 is intended to eliminate these disadvantages.
2.3 Parallel Computing
Visual Studio 2010 will have new features for parallel computing: A “Task” library, parallel LINQ and debugging support.
By pushing down the thread safety aspects into the library it is possible to make the application thread safe, but Microsoft explains that there is still a need for better integration of inter-thread messaging.
Other proposals at Microsoft are a new programming language (Maestro) and enhancements to Visual Basic (Concurrent Basic). In my opinion neither proposal can guarantee thread safety, as it is easy to pass references and illegally access objects belonging to another thread.
3. Proposed Thread Safe Programming in .NET
Passing references and accessing data is at the heart of programming. It must be allowed. But the object oriented constructs for data isolation (class, public, private...) are not designed for thread safety.
Object orientation is concerned with data and code modelling. Threading is an entirely different dimension of application composition.
There must be new constructs in the language to support thread safety.
.NET CLI attributes offer a way to store threading information in a .NET assembly.
Using these attributes C# and other CLI compilers can make thread programming much safer.
The compiler can check if the programmer has made threading errors and can generate code to make messaging as easy as a method call.
The next paragraphs describe the proposed attributes and compiler enhancements.
3.1 Threadbound Attribute
A class may be attributed with [Threadbound(Thread=T)]
Depending on how T is defined, the thread can be resolved at compile time or at runtime. This influences code generation (static or runtime checks).
The referenced thread T must be of a new type containing an input message queue.
Properties of the thread allow setting of maximum queue size and behaviour on queue overflow.
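To give an idea of how such an attribute could be stored in an assembly, here is a sketch that can be compiled with today's C#. It rests on an assumption: current attribute arguments must be compile-time constants, so the thread is identified by a name string here instead of a thread object. ThreadboundAttribute, Sensor and ThreadboundInfo are hypothetical names.

```csharp
using System;

// Hypothetical attribute. Today's C# only accepts constant attribute arguments,
// so this sketch identifies the thread by name (an assumption, not the proposal itself).
[AttributeUsage(AttributeTargets.Class, Inherited = true, AllowMultiple = false)]
public sealed class ThreadboundAttribute : Attribute
{
    public string Thread { get; set; }
}

// A class declared as bound to the thread named "Measurement".
[Threadbound(Thread = "Measurement")]
public class Sensor
{
    public double LastValue { get; set; }
}

public static class ThreadboundInfo
{
    // Returns the declared thread name of a type, or null if the type is not threadbound.
    // A compiler or analysis tool could read the same metadata.
    public static string ThreadNameOf(Type type)
    {
        var attr = (ThreadboundAttribute)Attribute.GetCustomAttribute(
            type, typeof(ThreadboundAttribute));
        return attr == null ? null : attr.Thread;
    }
}
```

A tool would call ThreadboundInfo.ThreadNameOf(typeof(Sensor)) to learn which thread the class belongs to. The static checks themselves would still have to be implemented in the compiler.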
3.2 QueuedOperation and BlockingQueuedOperation Attributes
A method of a threadbound object may be attributed with [QueuedOperation]
This leads to automatic thread safe interface code generation to invoke the method through a queued delegate.
Methods without a return value and without ref or out parameters are executed asynchronously and do not block the caller.
Other methods block the calling thread until return values have been generated by the other thread. These methods may lead to deadlocks! To distinguish the different situations I propose another attribute for these methods: [BlockingQueuedOperation]
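The following sketch shows by hand what the generated interface code might look like. Everything in it is an assumption made for illustration: the attribute classes are empty markers, MessageThread is a minimal stand-in for the proposed thread type with an input queue, and LoggerProxy plays the role of the code a compiler could generate.

```csharp
using System;
using System.Collections.Concurrent;
using System.Threading;
using System.Threading.Tasks;

// Hypothetical marker attributes named after the proposal.
[AttributeUsage(AttributeTargets.Method)]
public sealed class QueuedOperationAttribute : Attribute { }

[AttributeUsage(AttributeTargets.Method)]
public sealed class BlockingQueuedOperationAttribute : Attribute { }

// Minimal stand-in for the proposed thread type with an input message queue.
public sealed class MessageThread : IDisposable
{
    private readonly BlockingCollection<Action> _queue = new BlockingCollection<Action>();
    private readonly Thread _thread;

    public MessageThread()
    {
        _thread = new Thread(() =>
        {
            foreach (var message in _queue.GetConsumingEnumerable()) message();
        });
        _thread.IsBackground = true;
        _thread.Start();
    }

    public void Enqueue(Action message) { _queue.Add(message); }

    public void Dispose() { _queue.CompleteAdding(); _thread.Join(); }
}

// The class as a programmer would write it under the proposal.
public class Logger
{
    private int _lines;   // touched only by the owning thread, so no lock is needed

    [QueuedOperation]
    public void Write(string text) { _lines++; }

    [BlockingQueuedOperation]
    public int LineCount() { return _lines; }
}

// Hand-written equivalent of the interface code a compiler might generate.
public sealed class LoggerProxy
{
    private readonly Logger _target;
    private readonly MessageThread _owner;

    public LoggerProxy(Logger target, MessageThread owner)
    {
        _target = target;
        _owner = owner;
    }

    // No return value: enqueue the call and return at once.
    public void Write(string text) { _owner.Enqueue(() => _target.Write(text)); }

    // Return value: enqueue the call, then block until the owner thread has answered.
    public int LineCount()
    {
        var reply = new TaskCompletionSource<int>();
        _owner.Enqueue(() => reply.SetResult(_target.LineCount()));
        return reply.Task.Result;
    }
}
```

Because the queue is processed in order, a LineCount call issued after two Write calls from the same caller sees both lines. Calling LineCount from the owner thread itself would block forever, which is the kind of deadlock the separate attribute is meant to flag.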
3.3 DirtyReadOperation Attribute
A method of a threadbound class may be attributed with [DirtyReadOperation]
This tells the compiler that the user of the class takes responsibility for spurious output. These methods must not alter object state.
3.4 ThreadsafeOperation Attribute
A method of a threadbound class may be attributed with [ThreadsafeOperation]
Inside this method only members with [ThreadsafeData] attribute may be accessed.
An example for this attribute is a method that writes to or reads from a second thread safe queue.
The compiler checks whether a method is re-entrant and, if so, automatically applies an attribute with an effect similar to [ThreadsafeOperation].
3.5 ThreadsafeData Attribute
A data member of a threadbound object may be attributed with [ThreadsafeData]
Any access to this data must be in critical sections of the same synchronization object.
Compile time checks may allow access outside of critical sections, when there are only atomic operations.
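As an illustration of both rules, the sketch below guards two fields with one synchronization object and touches a third field only through atomic operations. ThreadsafeDataAttribute is a hypothetical marker and Statistics is an invented example class. Today's compiler does not check any of this, so the discipline is manual here.

```csharp
using System;
using System.Threading;

// Hypothetical marker attribute named after the proposal.
[AttributeUsage(AttributeTargets.Field)]
public sealed class ThreadsafeDataAttribute : Attribute { }

public class Statistics
{
    private readonly object _sync = new object();

    [ThreadsafeData] private long _sum;       // always accessed inside lock (_sync)
    [ThreadsafeData] private long _samples;   // always accessed inside lock (_sync)
    [ThreadsafeData] private int _errors;     // accessed only through atomic operations

    // Both fields change together, so they share one critical section.
    public void Add(long value)
    {
        lock (_sync) { _sum += value; _samples++; }
    }

    public double Mean()
    {
        lock (_sync) { return _samples == 0 ? 0.0 : (double)_sum / _samples; }
    }

    // A single atomic increment needs no critical section.
    public void ReportError() { Interlocked.Increment(ref _errors); }

    public int Errors { get { return Volatile.Read(ref _errors); } }
}
```

The mean must read both fields under the same lock that Add uses. Otherwise it could combine a new sum with an old sample count.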
3.6 More Compile Time Checks
When the thread of a threadbound object can be resolved at compile time, the compiler performs the following checks to enforce isolation between threads:
If an object is bound to thread T, all objects it calls that are not themselves threadbound are marked as threadbound to T.
If a threadbound object is called from another thread, the called method must carry one of the threadsafe attributes described above.
3.7 Runtime Checks
The compiler can be instructed to generate code that detects programming errors during runtime:
- Public, internal or virtual methods of a threadbound object that do not carry one of the threadsafe attributes described above are checked to verify that the call comes from the right thread.
- Each object needs a reference to "its" thread. Normally the reference is set automatically in the constructor. The user can change or reset this binding when construction and execution happen on different threads.
- Event handlers are called on threading errors. The user may allow these events in some application states.
This way, isolation between threads is enforced at least at runtime.
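A sketch of what such generated runtime checks could amount to, written by hand. ThreadboundObject, BindTo, CheckThread and the ThreadingError event are hypothetical names. The choice to throw when no handler is attached is an assumption of this sketch and not something the proposal fixes.

```csharp
using System;
using System.Threading;

// Hypothetical base class showing the kind of check a compiler could emit.
public abstract class ThreadboundObject
{
    // Bound to the constructing thread by default.
    private Thread _owner = Thread.CurrentThread;

    // Raised on a call from a foreign thread (the event shape is an assumption).
    public event Action<ThreadboundObject, Thread> ThreadingError;

    // Rebinds the object, e.g. when it is constructed on one thread and used on another.
    public void BindTo(Thread owner) { _owner = owner; }

    // The compiler would insert this call at the start of every method
    // that has no threadsafe attribute.
    protected void CheckThread()
    {
        if (Thread.CurrentThread == _owner) return;
        var handler = ThreadingError;
        if (handler != null)
            handler(this, Thread.CurrentThread);   // application decides what to do
        else
            throw new InvalidOperationException(
                "Call from a thread that does not own this object.");
    }
}

// Example threadbound class with the check written out manually.
public class Account : ThreadboundObject
{
    private decimal _balance;

    public void Deposit(decimal amount)
    {
        CheckThread();
        _balance += amount;
    }

    public decimal Balance
    {
        get { CheckThread(); return _balance; }
    }
}
```

A call to Deposit from the constructing thread passes silently. The same call from another thread raises ThreadingError, or throws if no handler is attached.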
The proposed .NET - CLI attributes would allow modelling the thread concept of an application.
Compilers and other tools could take advantage of this knowledge to find programming errors and enforce standards.
The proposal is intended to promote asynchronous messaging as in concept 2.2.
Automatic code generation for inter-thread messaging makes programming easier and safer.
The existing .NET library API can be used unchanged. An enhanced thread class with message queue would be needed as well as a thread reference for each object.
This is just a very short abstract of a wide and complex field. Much work would have to be done to design and improve the compiler for thread safety.
5. Motivation and Release History
In December 2008 I created the first version of my proposal "Thread Safe Programming Language".
Further investigation showed that Microsoft is working on the same subject and has already implemented and presented new features for Visual Studio 2010, especially in parallel computing, targeting multi-core machines and higher performance.
To share my knowledge with people that will shape the future of thread safe programming I decided in February 2009 to publish the second version of my proposal "Thread Safe Programming Language".
On March 1, 2009 I added a better introduction (3), comments about re-entrancy (3.4), details on runtime checks (3.7), this release history and a links section.
Recommended Google search patterns:
- "thread safe programming", "thread safe programming language"
- http://en.wikipedia.org/wiki/Thread-safety , some definitions
- http://www.codeproject.com/KB/cs/threadsafeforms, calling WinForms from another thread.
blogspot.com, my blog (not much there!)
Interesting links at Microsoft.com:
- http://blogs.msdn.com/pfxteam , the Microsoft parallel computing blog
- http://blogs.msdn.com/pfxteam/archive/2009/02/11/9413351, an overview.
- http://msdn.microsoft.com/en-us/concurrency, the Microsoft parallel computing homepage.
- http://channel9.msdn.com/pdc2008/TL26, parallel computing in Visual Studio 2010.
Maestro, a (meta-) language designed for multithreading.
Visual Basic enhancements for inter thread communication.