Friday, September 23, 2011

Recursion in Linq

Being able to traverse a tree structure can be very handy, or flatten a tree to a list. Linq doesn't offer this out of the box, but it's a fairly straightforward extension to add.


/// <summary>
/// Recusively traverse a tree structure
/// </summary>
/// <typeparam name="T"></typeparam>
/// <param name="collection"></param>
/// <param name="childrenSelector">Child navigation property on T</param>
/// <returns>Flattened collection</returns>
public static IEnumerable<T> Traverse<T>(this IEnumerable<T> collection, Func<T, IEnumerable<T>> childrenSelector)
{
    foreach (var item in collection)
    {
        yield return item;

        var children = childrenSelector(item);

        if (children != null)
        {
            foreach (var child in Traverse(children, childrenSelector))
            {
                yield return child;
            }
        }
    }
}

The source is pretty much straight from here, with an extra check against null children collections.

Thursday, July 7, 2011

Entry Extensions

Entity Framework 4.1 CodeFirst is quite a bit of fun to use, but tucks away some of the useful database functionality. One useful thing is to know if a property on a database entry has been changed since it was loaded. DbEntityEntry exposes OriginalValues and CurrentValues, but actually comparing the two can be streamlined with Expressions.

DbEntityEntry.PropertyChanged

public static class DbEntityEntryExtensions
{
    /// <summary>
    /// Determine if a property has changed since being loaded.
    /// <remarks>
    /// If comparing a property that is a sub-object against another instance, it will always throw true (changed).
    /// Complex properties should implement IEquatable to properly handle how equality is determined.
    /// </remarks>
    /// </summary>
    /// <typeparam name="TEntity"></typeparam>
    /// <typeparam name="TProperty"></typeparam>
    /// <param name="entity"></param>
    /// <param name="expression"></param>
    /// <returns></returns>
    public static bool PropertyChanged<TEntity, TProperty>(this DbEntityEntry<TEntity> entity, Expression<Func<TEntity, TProperty>> expression)
        where TEntity : class
    {
        var name = ((MemberExpression)expression.Body).Member.Name;

        return !entity.CurrentValues.GetValue<TProperty>(name).Equals(entity.OriginalValues.GetValue<TProperty>(name));
    }
}

Determining if a loaded POCO property has changed is pretty easy:

var o = MyDbContext.TestClasses.Find(1);
o.Id = 2;

var changed = MyDbContext.Entry(o).PropertyChanged(c => c.Id);   // True

This works fine for primitive types such as int, string and float. However, if your property is another object, it won't work. This is because at it's heart, we're calling Equals.


class TestClass
{
    public int Id { get; set; }
}

var a = new TestClass { Id = 1 };
var b = new TestClass { Id = 1 };

var equal = a.Equals(b);  // False!

To show this with EF 4.1...


class TestClass
{
    public int Id { get; set; }
    public AssociationClass Association { get; set; }
}

class AssociationClass
{
    public int Id { get; set; }
}

// Assume that the stored o.Association has an ID of 1
var o = var o = MyDbContext.TestClasses.Find(1);

// Set to a new, identical instance of the class
o.Association = new AssociationClass { Id = 1 };

// This is supposed to be false.  The database holds that TestClass:1
// is associated with AssociationClass:1.  However, since the objects
// are different, Equals will return false, and PropertyChanged will
// return true - thinking that the association has changed.
var changed = MyDbContext.Entry(o).PropertyChanged(c => c.Association);   // True (wrong)

There is probably a way to check if the property is an association and check possible entity keys, but that's for another time.

Tuesday, April 19, 2011

EntitySystems Part 3: EntityCollections

In part one we looked at the base for an Entity and EntityManager. In there we can create new entities and compose them of components. In part 2, we looked at actually storing components by their type.

When viewing an Entity System as a relational database, the component types are single table with the entity id as the primary key. Then entity represents a single joined row across multiple tables. Systems that consume entities will often rely on multiple component types. We could reference the component collections but that would require a lot of duplicate checking in each system to ensure that an entity contained all types it was looking for. Better to create a new type that tracks this for us.

EntityCollection

public sealed class EntityCollection : IEnumerable
{
    readonly EntityManager _manager;

    readonly Type[] _filter;
    readonly int _hashCode;

Each collection is composed a collection of types, so the first step is to store all of these types. These are used to make the hashcode for the collection. More explained in a bit.

    readonly SortedSet _sortedEntities;
    readonly Dictionary _entities;
    readonly IComparer _comparer;

To store entities in the collection, we use two different internal collections. An entity collection has two big requirements - O(1) lookup and sorting. The sorted collections in C# don't offer O(1) lookup time and a plain Dictionary can't be sorted. The solution - use two collections. Dictionary offers the fastest lookup and SortedSet offers the most consistent insert speed. The comparer is just that - an optional custom comaprer for sorted entries (think z-indexed sprites, etc...)

internal EntityCollection(EntityManager entityManager, Type[] types)
{
    _sortedEntities = new SortedSet();
    _hashCode = EntityCollection.GetHashCode(types, null);

    // Check that all types are really components
    if (!types.All(t => typeof(EntityComponent).IsAssignableFrom(t)))
    {
        throw new Exception("Type is not of IComponent - cannot filter.");
    }

    _manager = entityManager;
    _entities = new Dictionary();
    _filter = types;

    // Add any existing entites
    foreach (var entity in _manager)
    {
        if (MatchesFilter(entity))
        {
            Add(entity);
        }
    }
}

internal EntityCollection(EntityManager entityManager, Type[] types, Func comparer)
    : this(entityManager, types)
{
    // 
    var lambdaComparer = new Engine.Tools.LambdaComparer(comparer);

    _sortedEntities = new SortedSet(lambdaComparer);
    _hashCode = EntityCollection.GetHashCode(types, lambdaComparer);
}

internal EntityCollection(EntityManager entityManager, Type[] types, IComparer comparer)
    : this(entityManager, types)
{
    _sortedEntities = new SortedSet(comparer);
    _hashCode = EntityCollection.GetHashCode(types, comparer);
}

internal bool MatchesFilter(Entity entity)
{
    return _filter.All(t => entity.Is(t));
}

The constructor and most of the manipulations are internal. EntityCollections are not meant to be added to and removed from by the game classes. Systems that consume EntityCollection should only be iterating the collection. Use the EntityManager to manipulate entities - they will be automatically added/removed from EntityCollections.

Upon creation, the entity collection will search every entity and determine if it fits into the collection. Later as new entites are created or components removed, the EntityManager is responsible for notifiying collections about updates.

Seems like a good point to discus creating the hash code.

public static int GetHashCode(Type[] types, IComparer comparer)
{
    var hashCode = 0;

    // Hashcode is build off the filters, independent of their order
    for (int i = 0; i < types.Length; i++)
        hashCode += types[i].GetHashCode();

    // Add on comparer hashcode if set - comparers should overload their
    // own GetHashCode to return a unique constant.
    hashCode += (comparer == null) ? 0 : comparer.GetHashCode();

    return hashCode;
}

Different consumers of EntityCollections may draw upon the same types. Rather than have multiple collections representing the same types, we provide a mechanism to cache them in the EntityManager (where they're built). The hash code is built from the types being monitored as well as the comparer.

public delegate void EntityAddHandler(Entity entity);
public delegate void EntityRemoveHandler(Entity entity);

Systems consuming the collection like to be notified when the collection changes.

internal bool Add(Entity entity)
{
    // Check if the collection already contains this
    if (_entities.ContainsKey(entity.Id))
        return;

    // Ensure that the entity matches the filter
    if (!MatchesFilter(entity))
        return false;

    _entities.Add(entity.Id, entity);
    _sortedEntities.Add(entity);

    if (_entities.Count != _sortedEntities.Count)
        throw new Exception("EntityCollection internal collections don't match.  Probably a bad comparer for SortedSet.");

    // Notify any listeners that a new entity was added
    if (OnEntityAdd != null)
    {
        OnEntityAdd(entity);
    }
}
internal void Remove(Entity entity)
{
    // Check if the collection actually contains the entity
    if (!_entities.ContainsKey(entity.Id))
        return;

    // Notify any listeners that an entity was removed
    if (OnEntityRemove != null)
    {
        OnEntityRemove(entity);
    }

    _sortedEntities.Remove(entity);
    _entities.Remove(entity.Id);
}

Finally, Add and Remove do just that as well as notifiy any listeners. Notification is used to create or release resources or do extra initialization such as loading assets

There's a few more utility functions to be found in the full source.

EntityManager

The EntityManager needs to have a few changes made to be able to present EntityCollections. First, add the internal collection storage and add two internal functions to handle collections.

readonly Dictionary _collections;

internal void RemoveFromCollections(Entity e)
{
    // Remove entity from any collections
    foreach (var collection in _collections.Values)
    {
        collection.Remove(e);
    }
}

internal void RemoveFromCollections(Entity e, Type t)
{
    // Remove entity from any collections
    foreach (var collection in _collections.Values)
    {
        if (collection.ContainsType(t))
        {
            collection.Remove(e);
        }
    }
}

Second, we actually call these where appropriate.

public Entity Create(IEnumerable components)
{
    ...
    AddToCollections(e);
    ...
}
public void Remove(Entity entity)
{
    RemoveFromCollections(entity);     // Call this one first
    Components.Remove(entity.Id);
    _entities.Remove(entity.Id);
}

And lastly, we need way to actually get the collections.

public EntityCollection Get(params Type[] componentTypes)
{
    var hashCode = EntityCollection.GetHashCode(componentTypes);

    if (!_collections.ContainsKey(hashCode))
    {
        _collections.Add(hashCode, new EntityCollection(this, componentTypes));
    }

    return _collections[hashCode];
}

public EntityCollection Get(Type[] componentTypes, IComparer comparer)
{
    var hashCode = EntityCollection.GetHashCode(componentTypes, comparer);

    if (!_collections.ContainsKey(hashCode))
    {
        _collections.Add(hashCode, new EntityCollection(this, componentTypes, comparer));
    }

    return _collections[hashCode];
}

Using Collections

public MySystem(EntityManager manager)
{
    // Get the EntityCollection
    var entities = _entites = manager.Get(
        typeof(Foo),
        typeof(Bar));

    entities.OnEntityAdd += LoadComponent;
    entities.OnEntityAdd += UnloadComponent;
}

public void Draw(GameTime gameTime)
{
    foreach (var entity in _entities)
    {
        // Draw
    }
}

public void LoadComponent(Entity e)
{
    e.As().Texture = _contentManager.Load(e.As().AssetName);
}

public void UnloadComponent(Entity e)
{
    // Unload component
}

Consuming collections is pretty straight forward. The filter is created, events set, and it's good to go.

Tuesday, January 4, 2011

Sorted Collection Speeds

C# and .Net offer a couple different sorted classes. Arrays can be sorted, sure, but I'm looking at ones that offer some form of insertion sort. For this, I'm looking at these three classes:

Collection Insert(Random) Insert(Sorted) Remove Indexed Fetch
SortedDictionary O(log n) O(log n) O(log n) O(log n)
SortedList O(n) ~O(log n) ? O(n) O(log n)
SortedSet O(log n) O(log n) O(log n) -

Runs

SortedDictionary and SortedList using custom comparer

          Runs    Dictionary    SortedDict    SortedList     SortedSet
Insert
         50000    26,668          460,026      3,823,552     210,012
         60000    33,335          566,699      5,513,648     253,347
         70000    53,336          683,372      7,717,108     326,685
         80000    53,336          863,382     12,737,395     340,019
         90000    50,002          896,718     13,194,088     410,023
        100000    83,338        1,066,728     19,034,422     460,026
        110000    90,005        1,156,733     22,917,977     526,697
        120000    90,005        1,306,741     27,244,891     650,037
Fetch
         50000    13,334        313,351       293,350
         60000    13,334        550,031       363,354
         70000    16,667        506,695       463,359
         80000    23,334        570,032       520,029
         90000    26,668        666,705       610,034
        100000    26,668        753,376       703,373
        110000    30,001        870,049       796,712
        120000    30,002        976,722       896,718

Part of the problem comparing all these collections is that don't compare in the same way. SortedDictionary and SortedList both default to comparing unique keys. If you can guarantee each item in the collection has a unique sorting key, then they work great. If you require more complicated sorting, such as comparing against a non-unique value within the TValue, then you add in the extra overhead of storage and lookup of the base collection. SortedSet works better for comparing TValues.

SortedDictionary and SortedList using default key sorter (insert in order)

          Runs    Dictionary    SortedDict    SortedList     SortedSet
Insert
         50000        30,003       246,691       130,005       160,013
         60000        33,336       356,702       126,679       273,360
         70000        36,670       353,368        73,340       310,031
         80000        56,672       413,374        80,008       416,708
         90000        66,673       466,713        90,009       486,715
        100000        70,007       503,383       103,343       576,724
        110000        80,008       553,388       113,344       666,733
        120000       103,343       646,731       180,018       653,398
Fetch
         50000        10,001       123,345        73,340
         60000        13,334       160,016        90,009
         70000        16,668       186,685       106,677
         80000        16,668       233,356       130,013
         90000        20,002       246,691       140,014
        100000        20,002       266,693       156,682
        110000        23,335       310,031       190,019
        120000        23,335       330,033       190,019

Using the default key comparer in order, SortedList performs optimally.

SortedDictionary and SortedList using default key sorter (insert in reverse order)

          Runs    Dictionary    SortedDict    SortedList     SortedSet
Insert

         50000        30,000       273,333    23,233,332       210,000
         60000        30,000       293,333    28,999,999       276,666
         70000        46,666       350,000    29,679,999       343,333
         80000        50,000       433,333    60,493,330       370,000
         90000        50,000       526,666    49,526,662       500,000
        100000        56,666       560,000    79,496,663       600,000
        110000        56,666       716,666   114,516,661       640,000
        120000        63,333       753,333    98,069,995       720,000
Fetch

         50000        10,000       123,333        73,333
         60000        13,333       150,000        93,333
         70000        13,333       176,666       106,666
         80000        20,000       200,000       123,333
         90000        20,000       243,333       140,000
        100000        23,333       263,333       153,333
        110000        26,666       293,333       173,333
        120000        30,000       323,333       186,666

Do the same run only with reversed keys and SortedList falls apart. SortedDictionary is most consistent with key comparisons (SortedSet still comparing values in these runs).

Notes

These are an average of three runs each. I'm not sure if I'm doing something wrong for the SortedList inserts or not - it's way above everything else. Otherwise for purely random data inserts, SortedSet seems to be the most consistent. If the data is mostly pre-sorted, then SortedList will beat them all, but then bombs with random data. Indexed fetch isn't available for SortedSet so it's not shown. SortedDictionary and List both perform and scale similarly with a custom comparer.

Repository for test code