GENERICS Sasha Goldshtein blog.sashag.net | @goldshtn
What Is This Talk? • Understanding how generics are implemented in C++, Java and .NET at the runtime and machine-code level • Understanding the performance implications and other pros/cons of each mechanism • We will not learn how to use generics
Why Do We Want Them? • “Pure” object-oriented programming does not always provide a clean and type-safe solution with good performance • In other words, what’s wrong here? public class ArrayList { object[] items; public void Add(object item) { ... } public object ElementAt(int index) { ... } }
The C++ Approach “Templates, the smart macros from hell.” • Use parameterized template as a sketch • No constraints on the original template code • Everything happens at compile-time template <typename RanIt> void sort(RanIt begin, RanIt end) { … if (*begin < *(begin+1)) … }
C++ Template Definition template <typename T> class vector { T* data; int size; int cap; public: vector(int capacity) { ... } void push_back(const T& datum) { ... } T operator[](int index) const { ... } };
C++ Template Instantiation You say: vector<int> v(2); Compiler says: class __vector__int__ { int* data; int size; int cap; public: vector(int capacity) { ... } };
C++ Template Instantiation You say: vector<int> v(2); v.push_back(42); Compiler says: class __vector__int__ { int* data; int size; int cap; public: vector(int capacity) { ... } void push_back(const int& datum) { ... } };
C++ Template Instantiation You say: vector<EmptyClass> v(2); sort(v.begin(), v.end()); Compiler says: error C2784: 'bool std::operator <(const std::vector<_Ty,_Alloc> &,const std::vector<_Ty,_Alloc> &)' : could not deduce template argument for 'const std::vector<_Ty,_Alloc> &' from 'EmptyClass' vector(1724) : see declaration of 'std::operator <' templatesstuff.cpp(20) : see reference to function template instantiation 'void sort<std::_Vector_iterator<_Myvec>>(RanIt,RanIt)' being compiled with [ _Myvec=std::_Vector_val<std::_Simple_types<EmptyClass>>, RanIt=std::_Vector_iterator<std::_Vector_val< std::_Simple_types<EmptyClass>>> ]
The C++ Approach—Pros and Cons Pros Cons • No performance cost • Can’t share templates • Very flexible between translation • Full compile-time type units safety • Can’t share templates between libraries (code bloat) • Can’t reliably export templates from libraries • No constraints = no readable compiler errors
The Java Approach • Use parameterized template as a compiler aid • Constraints used to prove things to the compiler • Erase type information at runtime public class LinkedList<E> { private LinkedList<E> head; private E value; public void add(E element) { ... } public E getAt(int index) { ... } }
Java Generic Type Erasure There is just one type (raw type) at runtime: public class LinkedList { private LinkedList head; private Object value; public void add(Object element) { ... } public Object getAt(int index) { ... } }
Java Generic Type Constraints Cannot use anything but java.lang.Object methods without specifying constraint (wildcard): public class SortedList<E extends Comparable<E>> { ... public void add(E element) { ... if (element.compareTo(other)) ... } }
The Java Approach—Pros and Cons Pros Cons • Backwards compatible • Can’t use generics with with non-generic Java primitive types versions • Can’t distinguish • Constraint violation between generic class results in clear instantiations compiler error • Can’t instantiate • Can share generic generic type types and objects parameters (“new E”) between • Can’t use type packages/applications parameters in static methods or fields
The .NET Approach • Use parameterized template as a compiler aid and a runtime code generation sketch for the JIT • Constraints used to prove things to the compiler public class List<T> { T[] items; int size; int cap; public void Add(T item) { ... } public T this[int index] { get { ... } set { ... } } }
Digression: .NET Object Layout
.NET Generic Types at Runtime • There is a separate type at runtime for each generic instantiation, but not necessarily a separate copy of the methods’ code • Does this method’s machine code depend on T? public void Add(T item) { if (size < items.Length – 1) { items[size] = item; ++size; } else AllocateAndAddSlow(item); }
.NET Generic Code Sharing
Concrete Example: Stack Push BasicStack`1[[System.__Canon, mscorlib]].Push(System.__Canon) 00260360 57 push edi 00260361 56 push esi 00260362 8b7104 mov esi,dword ptr [ecx+4] 00260365 8b7908 mov edi,dword ptr [ecx+8] 00260368 8d4701 lea eax,[edi+1] 0026036b 894108 mov dword ptr [ecx+8],eax 0026036e 52 push edx 0026036f 8bce mov ecx,esi 00260371 8bd7 mov edx,edi 00260373 e8f4cb3870 call clr!JIT_Stelem_Ref (705ecf6c) 00260378 5e pop esi 00260379 5f pop edi 0026037a c3 ret
Concrete Example: Stack Push BasicStack`1[[System.Int32, mscorlib]].Push(Int32) 002603c0 57 push edi 002603c1 56 push esi 002603c2 8b7104 mov esi,dword ptr [ecx+4] 002603c5 8b7908 mov edi,dword ptr [ecx+8] 002603c8 8d4701 lea eax,[edi+1] 002603cb 894108 mov dword ptr [ecx+8],eax 002603ce 3b7e04 cmp edi,dword ptr [esi+4] 002603d1 7307 jae 002603da 002603d3 8954be08 mov dword ptr [esi+edi*4+8],edx 002603d7 5e pop esi 002603d8 5f pop edi 002603d9 c3 ret 002603da e877446170 call clr!JIT_RngChkFail (70874856) 002603df cc int 3
Concrete Example: Stack Push BasicStack`1[[System.Double, mscorlib]].Push(Double) 00260420 56 push esi 00260421 8b5104 mov edx,dword ptr [ecx+4] 00260424 8b7108 mov esi,dword ptr [ecx+8] 00260427 8d4601 lea eax,[esi+1] 0026042a 894108 mov dword ptr [ecx+8],eax 0026042d 3b7204 cmp esi,dword ptr [edx+4] 00260430 730c jae 0026043e 00260432 dd442408 fld qword ptr [esp+8] 00260436 dd5cf208 fstp qword ptr [edx+esi*8+8] 0026043a 5e pop esi 0026043b c20800 ret 8 0026043e e813446170 call clr!JIT_RngChkFail (70874856) 00260443 cc int 3
Type-Specific Code • What about new T[12] or typeof(T).FullName? • When .NET generic methods need access to T, they get it from the method table (this or hidden parameter) • …Unless the type parameters are value types, in which case the MT is hard-coded into the method: C#: Foo<T>() { … typeof(T) … } T=int Machine code: mov ecx,offset 798b6844 (MT: System.Int32) call clr!JIT_GetRuntimeType (6ca40aa8)
Generics and Reflection • Because generic types are first-class citizens, they are accessible to Reflection at runtime Type to = typeof(Dictionary<,>); Type tc = to.MakeGenericType( typeof(string), typeof(int)); to = typeof(List<double>).GetGenericTypeDefinition(); tc = to.MakeGenericType(typeof(int)); //List<int>
Generic Constraints • .NET constraints restrict type parameters at compile- time, very similar to Java’s • Only a limited set of constraints available: • Interface constraint: where T : IComparable<T> • Base constraint: where T : UserControl • Category constraint: where T : class or where T : struct • Constructor constraint: where T : new() Note that constraints don’t break the machine code equivalence for reference types. Why?
Case Study: IEquatable<T> public static void CallEquals<T>(T inst) { inst.Equals(inst); } public struct Point { public int X, Y; public override bool Equals(object o) { if (o is Point) return Equals((Point)o); return false; } public bool Equals(Point pt) { ... } }
Case Study: IEquatable<T> • CallEquals has no constraints, so the C# compiler chooses the Object.Equals(Object) virtual method • We can add an interface constraint with a strongly-typed Equals method—now the compiler prefers it • Note: the interface call has no virtual cost on value types public static void CallEquals<T>(T inst) where T : IEquatable<T> { inst.Equals(inst); }
Sorting “If Possible”, a la C++ public class List<T> { T[] items; ... public void Add(T item) { ... } public void Sort(SortProvider<T> sorter = null) { sorter = sorter ?? SortProvider<T>.GetDefault(); if (sorter == null) throw new NotImplementedException(); sorter.Sort(items); } }
Sorting “If Possible”, a la C++ public abstract class SortProvider<T> { public abstract void Sort(T[] items); public static SortProvider<T> GetDefault() { if (T is IComparable<T>) return new DefaultSortProvider<T>(); if (T is IGreaterthanable<T>) return new GreaterThanSortProvider<T>(); return null; } } internal class DefaultSortProvider<T> : SortProvider<T> where T : IComparable<T> { //Use T.CompareTo for sorting }
Getting Generic Math Right in .NET • Pretty nasty: • Consider Complex<T>: you can’t implement operators… • Solution sketch: • Define ICalculator<T> with methods instead of operators • Implement ICalculator<T> for each T • Choose between ICalculator<T>’s implementations at runtime, and use them in your generic math code • For more: http://www.codeproject.com/Articles/8531/Using-generics-for-calculations
The .NET Approach—Pros and Cons Pros Cons • Constraint violation • Constraints are not results in clear compiler enough for everything error (e.g., generic math) • Can share generic types • No meta-programming and objects between abilities (advantage?) packages/applications • Can use generics efficiently with value types • Can use Reflection to query over generic types
QUESTIONS? Sasha Goldshtein blog.sashag.net | @goldshtn

Generics in .NET, C++ and Java

  • 1.
  • 2.
    What Is ThisTalk? • Understanding how generics are implemented in C++, Java and .NET at the runtime and machine-code level • Understanding the performance implications and other pros/cons of each mechanism • We will not learn how to use generics
  • 3.
    Why Do WeWant Them? • “Pure” object-oriented programming does not always provide a clean and type-safe solution with good performance • In other words, what’s wrong here? public class ArrayList { object[] items; public void Add(object item) { ... } public object ElementAt(int index) { ... } }
  • 4.
    The C++ Approach “Templates, the smart macros from hell.” • Use parameterized template as a sketch • No constraints on the original template code • Everything happens at compile-time template <typename RanIt> void sort(RanIt begin, RanIt end) { … if (*begin < *(begin+1)) … }
  • 5.
    C++ Template Definition template<typename T> class vector { T* data; int size; int cap; public: vector(int capacity) { ... } void push_back(const T& datum) { ... } T operator[](int index) const { ... } };
  • 6.
    C++ Template Instantiation Yousay: vector<int> v(2); Compiler says: class __vector__int__ { int* data; int size; int cap; public: vector(int capacity) { ... } };
  • 7.
    C++ Template Instantiation Yousay: vector<int> v(2); v.push_back(42); Compiler says: class __vector__int__ { int* data; int size; int cap; public: vector(int capacity) { ... } void push_back(const int& datum) { ... } };
  • 8.
    C++ Template Instantiation Yousay: vector<EmptyClass> v(2); sort(v.begin(), v.end()); Compiler says: error C2784: 'bool std::operator <(const std::vector<_Ty,_Alloc> &,const std::vector<_Ty,_Alloc> &)' : could not deduce template argument for 'const std::vector<_Ty,_Alloc> &' from 'EmptyClass' vector(1724) : see declaration of 'std::operator <' templatesstuff.cpp(20) : see reference to function template instantiation 'void sort<std::_Vector_iterator<_Myvec>>(RanIt,RanIt)' being compiled with [ _Myvec=std::_Vector_val<std::_Simple_types<EmptyClass>>, RanIt=std::_Vector_iterator<std::_Vector_val< std::_Simple_types<EmptyClass>>> ]
  • 9.
    The C++ Approach—Prosand Cons Pros Cons • No performance cost • Can’t share templates • Very flexible between translation • Full compile-time type units safety • Can’t share templates between libraries (code bloat) • Can’t reliably export templates from libraries • No constraints = no readable compiler errors
  • 10.
    The Java Approach •Use parameterized template as a compiler aid • Constraints used to prove things to the compiler • Erase type information at runtime public class LinkedList<E> { private LinkedList<E> head; private E value; public void add(E element) { ... } public E getAt(int index) { ... } }
  • 11.
    Java Generic TypeErasure There is just one type (raw type) at runtime: public class LinkedList { private LinkedList head; private Object value; public void add(Object element) { ... } public Object getAt(int index) { ... } }
  • 12.
    Java Generic TypeConstraints Cannot use anything but java.lang.Object methods without specifying constraint (wildcard): public class SortedList<E extends Comparable<E>> { ... public void add(E element) { ... if (element.compareTo(other)) ... } }
  • 13.
    The Java Approach—Prosand Cons Pros Cons • Backwards compatible • Can’t use generics with with non-generic Java primitive types versions • Can’t distinguish • Constraint violation between generic class results in clear instantiations compiler error • Can’t instantiate • Can share generic generic type types and objects parameters (“new E”) between • Can’t use type packages/applications parameters in static methods or fields
  • 14.
    The .NET Approach •Use parameterized template as a compiler aid and a runtime code generation sketch for the JIT • Constraints used to prove things to the compiler public class List<T> { T[] items; int size; int cap; public void Add(T item) { ... } public T this[int index] { get { ... } set { ... } } }
  • 15.
  • 16.
    .NET Generic Typesat Runtime • There is a separate type at runtime for each generic instantiation, but not necessarily a separate copy of the methods’ code • Does this method’s machine code depend on T? public void Add(T item) { if (size < items.Length – 1) { items[size] = item; ++size; } else AllocateAndAddSlow(item); }
  • 17.
  • 18.
    Concrete Example: StackPush BasicStack`1[[System.__Canon, mscorlib]].Push(System.__Canon) 00260360 57 push edi 00260361 56 push esi 00260362 8b7104 mov esi,dword ptr [ecx+4] 00260365 8b7908 mov edi,dword ptr [ecx+8] 00260368 8d4701 lea eax,[edi+1] 0026036b 894108 mov dword ptr [ecx+8],eax 0026036e 52 push edx 0026036f 8bce mov ecx,esi 00260371 8bd7 mov edx,edi 00260373 e8f4cb3870 call clr!JIT_Stelem_Ref (705ecf6c) 00260378 5e pop esi 00260379 5f pop edi 0026037a c3 ret
  • 19.
    Concrete Example: StackPush BasicStack`1[[System.Int32, mscorlib]].Push(Int32) 002603c0 57 push edi 002603c1 56 push esi 002603c2 8b7104 mov esi,dword ptr [ecx+4] 002603c5 8b7908 mov edi,dword ptr [ecx+8] 002603c8 8d4701 lea eax,[edi+1] 002603cb 894108 mov dword ptr [ecx+8],eax 002603ce 3b7e04 cmp edi,dword ptr [esi+4] 002603d1 7307 jae 002603da 002603d3 8954be08 mov dword ptr [esi+edi*4+8],edx 002603d7 5e pop esi 002603d8 5f pop edi 002603d9 c3 ret 002603da e877446170 call clr!JIT_RngChkFail (70874856) 002603df cc int 3
  • 20.
    Concrete Example: StackPush BasicStack`1[[System.Double, mscorlib]].Push(Double) 00260420 56 push esi 00260421 8b5104 mov edx,dword ptr [ecx+4] 00260424 8b7108 mov esi,dword ptr [ecx+8] 00260427 8d4601 lea eax,[esi+1] 0026042a 894108 mov dword ptr [ecx+8],eax 0026042d 3b7204 cmp esi,dword ptr [edx+4] 00260430 730c jae 0026043e 00260432 dd442408 fld qword ptr [esp+8] 00260436 dd5cf208 fstp qword ptr [edx+esi*8+8] 0026043a 5e pop esi 0026043b c20800 ret 8 0026043e e813446170 call clr!JIT_RngChkFail (70874856) 00260443 cc int 3
  • 21.
    Type-Specific Code • Whatabout new T[12] or typeof(T).FullName? • When .NET generic methods need access to T, they get it from the method table (this or hidden parameter) • …Unless the type parameters are value types, in which case the MT is hard-coded into the method: C#: Foo<T>() { … typeof(T) … } T=int Machine code: mov ecx,offset 798b6844 (MT: System.Int32) call clr!JIT_GetRuntimeType (6ca40aa8)
  • 22.
    Generics and Reflection •Because generic types are first-class citizens, they are accessible to Reflection at runtime Type to = typeof(Dictionary<,>); Type tc = to.MakeGenericType( typeof(string), typeof(int)); to = typeof(List<double>).GetGenericTypeDefinition(); tc = to.MakeGenericType(typeof(int)); //List<int>
  • 23.
    Generic Constraints • .NETconstraints restrict type parameters at compile- time, very similar to Java’s • Only a limited set of constraints available: • Interface constraint: where T : IComparable<T> • Base constraint: where T : UserControl • Category constraint: where T : class or where T : struct • Constructor constraint: where T : new() Note that constraints don’t break the machine code equivalence for reference types. Why?
  • 24.
    Case Study: IEquatable<T> public static void CallEquals<T>(T inst) { inst.Equals(inst); } public struct Point { public int X, Y; public override bool Equals(object o) { if (o is Point) return Equals((Point)o); return false; } public bool Equals(Point pt) { ... } }
  • 25.
    Case Study: IEquatable<T> •CallEquals has no constraints, so the C# compiler chooses the Object.Equals(Object) virtual method • We can add an interface constraint with a strongly-typed Equals method—now the compiler prefers it • Note: the interface call has no virtual cost on value types public static void CallEquals<T>(T inst) where T : IEquatable<T> { inst.Equals(inst); }
  • 26.
    Sorting “If Possible”,a la C++ public class List<T> { T[] items; ... public void Add(T item) { ... } public void Sort(SortProvider<T> sorter = null) { sorter = sorter ?? SortProvider<T>.GetDefault(); if (sorter == null) throw new NotImplementedException(); sorter.Sort(items); } }
  • 27.
    Sorting “If Possible”,a la C++ public abstract class SortProvider<T> { public abstract void Sort(T[] items); public static SortProvider<T> GetDefault() { if (T is IComparable<T>) return new DefaultSortProvider<T>(); if (T is IGreaterthanable<T>) return new GreaterThanSortProvider<T>(); return null; } } internal class DefaultSortProvider<T> : SortProvider<T> where T : IComparable<T> { //Use T.CompareTo for sorting }
  • 28.
    Getting Generic MathRight in .NET • Pretty nasty: • Consider Complex<T>: you can’t implement operators… • Solution sketch: • Define ICalculator<T> with methods instead of operators • Implement ICalculator<T> for each T • Choose between ICalculator<T>’s implementations at runtime, and use them in your generic math code • For more: http://www.codeproject.com/Articles/8531/Using-generics-for-calculations
  • 29.
    The .NET Approach—Prosand Cons Pros Cons • Constraint violation • Constraints are not results in clear compiler enough for everything error (e.g., generic math) • Can share generic types • No meta-programming and objects between abilities (advantage?) packages/applications • Can use generics efficiently with value types • Can use Reflection to query over generic types
  • 30.