Don’t use PDF.
Don’t use Word.
For heaven’s sake, don’t “make a deck”.
Just use markdown.
Agents can consume markdown far more easily than they can consume these other formats. Just use markdown.
I benchmarked equality, GetHashCode, and HashSet operations across different C# type implementations. The tests compared struct, readonly struct, record, and readonly record struct at various sizes (16-128 bytes).
The conventional wisdom has been "don't put more than 16 bytes in a struct" on 32-bit CPU architectures, and no more than 32 bytes on 64-bit CPUs, though the 16-byte guidance is the one usually repeated. I wanted to see what the actual performance degradation was as struct size increases.
I like to use readonly record struct to model aspects of my domain knowing that the compiler will optimize the type away to just the properties it contains, while guaranteeing domain correctness.
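As a sketch of what that looks like (the `Money` type here is my own illustration, not from the benchmark code):

```csharp
// For a readonly record struct, the compiler generates value-based
// Equals, GetHashCode, ==, and != over the declared properties,
// with no boxing and no reflection.
public readonly record struct Money(decimal Amount, string Currency);
```

`new Money(10m, "USD") == new Money(10m, "USD")` evaluates to true, and both values produce the same hash code, which is exactly what hash-based collections need.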
Tests were run on an Apple M3 using .NET 8 with BenchmarkDotNet’s ShortRun configuration.
record is a reference type, so increasing the number of properties doesn't change its performance characteristics much, but I scaled it anyway to keep the basis for comparison as even as possible.
Equals:

| Type | 16-byte (ns) | 32-byte (ns) | 64-byte (ns) | 128-byte (ns) |
|---|---|---|---|---|
| Struct | 7.66 | 8.06 | 10.83 | 16.46 |
| Struct + IEquatable | 0.00 | 1.25 | 3.61 | 9.82 |
| Readonly Struct | 7.56 | 8.02 | 10.91 | 16.51 |
| Readonly Struct + IEquatable | 0.00 | 1.08 | 3.21 | 9.64 |
| Readonly Record Struct | 0.00 | 1.00 | 2.82 | 7.57 |
| Record | 0.94 | 1.27 | 2.55 | 6.32 |
GetHashCode:

| Type | 16-byte (ns) | 32-byte (ns) | 64-byte (ns) | 128-byte (ns) |
|---|---|---|---|---|
| Struct | 12.41 | 12.64 | 10.88 | 13.13 |
| Struct + IEquatable | 1.90 | 3.14 | 9.45 | 17.71 |
| Readonly Struct | 11.14 | 12.95 | 10.98 | 13.22 |
| Readonly Struct + IEquatable | 1.73 | 2.87 | 8.96 | 16.82 |
| Readonly Record Struct | 0.00 | 0.43 | 2.67 | 9.80 |
| Record | 1.33 | 1.79 | 4.31 | 10.82 |
HashSet operations:

| Type | 16-byte (ns) | 32-byte (ns) | 64-byte (ns) | 128-byte (ns) |
|---|---|---|---|---|
| Struct | 20.27 | 23.59 | 24.00 | 34.56 |
| Struct + IEquatable | 5.08 | 6.73 | 15.90 | 27.34 |
| Readonly Struct | 20.47 | 23.87 | 24.08 | 33.87 |
| Readonly Struct + IEquatable | 5.11 | 6.61 | 15.50 | 27.41 |
| Readonly Record Struct | 2.42 | 4.61 | 9.57 | 20.65 |
| Record | 6.14 | 6.89 | 10.51 | 19.20 |
Memory allocated per operation:

| Type | 16-byte | 32-byte | 64-byte | 128-byte |
|---|---|---|---|---|
| Struct (Equals) | 64 B | 96 B | 160 B | 288 B |
| Struct (GetHashCode) | 32 B | 48 B | 80 B | 144 B |
| Struct (HashSet) | 96 B | 144 B | 240 B | 432 B |
| Readonly Struct (Equals) | 64 B | 96 B | 160 B | 288 B |
| Readonly Struct (GetHashCode) | 32 B | 48 B | 80 B | 144 B |
| Readonly Struct (HashSet) | 96 B | 144 B | 240 B | 432 B |
| All others | 0 B | 0 B | 0 B | 0 B |
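For plain structs, the usual fix is to implement IEquatable&lt;T&gt; by hand; a minimal sketch (the type and fields are illustrative, not from the benchmark suite):

```csharp
using System;

// A plain struct falls back to ValueType.Equals, which boxes both
// operands and uses reflection. Implementing IEquatable<T> gives
// HashSet<T> and Dictionary<TKey, TValue> a non-boxing path.
public readonly struct Point3 : IEquatable<Point3>
{
    public readonly double X, Y, Z;

    public Point3(double x, double y, double z) { X = x; Y = y; Z = z; }

    public bool Equals(Point3 other) => X == other.X && Y == other.Y && Z == other.Z;

    // Keep the object overloads consistent with the typed Equals.
    public override bool Equals(object obj) => obj is Point3 p && Equals(p);
    public override int GetHashCode() => HashCode.Combine(X, Y, Z);
}
```

Keeping `GetHashCode()` consistent with `Equals()` is what lets the hash-based collections skip the boxed path entirely.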
Types that don't implement IEquatable&lt;T&gt; fall back to ValueType.Equals, which boxes and allocates, causing 7-16 ns of overhead per operation; overall, the ValueType.Equals path is 3-5x slower.

I benchmarked 14 common approaches to counting substrings in .NET. The approaches differed by up to 60-70x in execution time, with memory allocations ranging from zero to over 160 KB.
Tests searched for substring “the” in two strings on an Apple M3 using .NET 8 with BenchmarkDotNet’s ShortRun configuration. The big string was the first chapter of The Hobbit, and the small string was the first 100 chars of the big string.
| Approach | Small (ns) | Large (ns) | Allocated |
|---|---|---|---|
| Span | 17.71 | 8,227 | 0 B |
| IndexOf (Ordinal) | 18.93 | 8,662 | 0 B |
| IndexOf (OrdinalIgnoreCase) | 20.47 | 10,463 | 0 B |
| String.Replace | 37.33 | 24,645 | 216 B / 87,963 B |
| Cached, compiled Regex | 127.17 | 40,968 | 560 B / 162,880 B |
| Instantiating a Regex inline | 416.44 | 49,698 | 2,528 B / 164,848 B |
| Static Regex (Regex.Match) | 154.42 | 50,996 | 560 B / 162,880 B |
| String.Split | 145.47 | 70,195 | 304 B / 111,058 B |
| IndexOf (InvariantCulture) | 1,216.64 | 523,154 | 0 B / 1 B |
| IndexOf (InvariantCultureIgnoreCase) | 1,314.57 | 534,426 | 0 B / 1 B |
| IndexOf (CurrentCultureIgnoreCase) | 1,329.19 | 536,436 | 0 B / 1 B |
| IndexOf (CurrentCulture – default) | 1,224.49 | 553,913 | 0 B / 1 B |
The Allocated column shows allocations for the small / large text, respectively.
If you're a backend or line-of-business developer modeling your domain, you probably want IndexOf with StringComparison.Ordinal or OrdinalIgnoreCase, depending on your domain semantics.
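A counting loop over IndexOf with StringComparison.Ordinal might look like this (the helper is my own sketch, not from the benchmark code):

```csharp
using System;

public static class SubstringCounter
{
    // Counts non-overlapping occurrences of `needle` in `haystack`
    // using an ordinal comparison, which avoids the culture-aware
    // slow path shown in the table above.
    public static int Count(string haystack, string needle)
    {
        if (string.IsNullOrEmpty(needle)) return 0;

        var count = 0;
        var index = 0;
        while ((index = haystack.IndexOf(needle, index, StringComparison.Ordinal)) >= 0)
        {
            count++;
            index += needle.Length; // skip past this match
        }
        return count;
    }
}
```

`SubstringCounter.Count("the cat then the dog", "the")` returns 3, since "then" also contains "the".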
FrozenDictionary offers faster reads but slower creation. Here’s when the trade-off makes sense.
Based on benchmark data, here are the break-even points where FrozenDictionary becomes worthwhile:
| Collection Size | Cache Hits | Cache Misses |
|---|---|---|
| 10 elements | 276 reads | 125 reads |
| 100 elements | 1,831 reads | 804 reads |
| 1,000 elements | 22,157 reads | 9,634 reads |
| 10,000 elements | 217,271 reads | 104,890 reads |
(Based on string keys and OrdinalIgnoreCase comparison.)
FrozenDictionary is much faster for failed lookups (0.33 ns vs. 6-7 ns), so collections that see many cache misses justify the switch sooner. FrozenDictionary also maintains good read performance regardless of collection size. Switch to FrozenDictionary when you expect read volumes well beyond these break-even points.
FrozenDictionary's creation penalty is substantial, but it decreases as collection size increases:
| Elements | Dictionary | FrozenDictionary | Overhead |
|---|---|---|---|
| 10 | 90.30 ns | 867.24 ns | 9.6x |
| 100 | 900.73 ns | 6,285.94 ns | 7.0x |
| 1,000 | 10,597.66 ns | 65,989.60 ns | 6.2x |
| 10,000 | 138,642.89 ns | 781,551.17 ns | 5.6x |
Per-read lookup times:

| Elements | Dictionary Hit | Dictionary Miss | FrozenDictionary Hit | FrozenDictionary Miss |
|---|---|---|---|---|
| 10 | 5.48 ns | 6.54 ns | 2.66 ns | 0.33 ns |
| 100 | 5.77 ns | 7.04 ns | 2.83 ns | 0.34 ns |
| 1,000 | 5.45 ns | 6.08 ns | 2.95 ns | 0.33 ns |
| 10,000 | 5.64 ns | 6.49 ns | 2.68 ns | 0.36 ns |
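In .NET 8, making the switch is a one-liner at construction time; a minimal sketch (the table name and keys are made up):

```csharp
using System;
using System.Collections.Frozen;
using System.Collections.Generic;

public static class StatusCodes
{
    // Built once at startup: FrozenDictionary front-loads the cost of
    // construction in exchange for cheaper reads, especially misses.
    public static readonly FrozenDictionary<string, int> Lookup =
        new Dictionary<string, int>(StringComparer.OrdinalIgnoreCase)
        {
            ["alpha"] = 1,
            ["beta"] = 2,
        }
        .ToFrozenDictionary(StringComparer.OrdinalIgnoreCase);
}
```

Reads look exactly like a regular Dictionary (`StatusCodes.Lookup["ALPHA"]`, `TryGetValue`, `ContainsKey`); only the construction step changes, so the switch is easy to reverse if the read volume never materializes.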
My Logitech Brio stopped working after I upgraded from Monterey to Ventura. It’s always been connected to an OWC dock, along with a bunch of other peripherals. Maybe I can save you 15-20 minutes by sharing what I did:
And that was it. You can use Photo Booth to test at steps 3 and 4 to make sure it’s working along the way. I also rebooted, and everything stayed fixed after the restart.
Because I can never remember them.
Like many developers, I have collected a bunch of useful methods over time. Most of the time, these methods don't have unit tests, nor do they have performance tests. Many of them have origins at StackOverflow (which licenses contributed content under CC BY-SA) and many of them don't.
I started collecting them formally about two years ago. Recently I decided to actually turn them into something I could consume via NuGet, because I was getting fed up with copying and pasting code everywhere.
Haystack targets .NET Standard 1.3, which means it works with:
Constant-time string comparison matters in cryptography for various reasons. Fast, early-exit string comparisons can leak timing information, so we want to exhaustively check all the bytes in the string, even if we know early on that the strings aren't equal.
```csharp
const string here = "Here";
const string there = "There";
var areSame = here.ConstantTimeEquals(there); // false
```
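Haystack's actual implementation may differ, but the standard shape of a constant-time comparison is an XOR accumulator that never exits early:

```csharp
using System;

public static class ConstantTime
{
    // Touches every character regardless of where the first mismatch
    // occurs, so timing doesn't reveal the mismatch position.
    // A sketch of the idea, not Haystack's exact source.
    public static bool ConstantTimeEquals(string left, string right)
    {
        if (left.Length != right.Length) return false;

        var diff = 0;
        for (var i = 0; i < left.Length; i++)
        {
            diff |= left[i] ^ right[i]; // accumulate differences, never branch
        }
        return diff == 0;
    }
}
```

Note that this sketch still reveals a length mismatch immediately; that's generally acceptable, since lengths of secrets like hashes and MACs are fixed and public.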
It's useful to be able to remove substrings from the beginning and/or end of a string, with or without a StringComparer overload.
```csharp
const string trim = "Hello world";
const string hello = "Hello worldThis is a hello worldHello world";
var trimFront = hello.TrimStart(trim); // This is a hello worldHello world
var trimEnd = hello.TrimEnd(trim);     // Hello worldThis is a hello world
var trimBoth = hello.Trim(trim);       // This is a hello world
```
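A sketch of how such an overload can be implemented (this is my own illustration, not Haystack's source):

```csharp
using System;

public static class StringTrimExtensions
{
    // Removes occurrences of `value` from the start of `input`.
    // The built-in string.TrimStart only takes characters, so a
    // substring version has to be an extension method.
    public static string TrimStart(this string input, string value,
        StringComparison comparison = StringComparison.Ordinal)
    {
        if (string.IsNullOrEmpty(value)) return input;

        while (input.StartsWith(value, comparison))
        {
            input = input.Substring(value.Length);
        }
        return input;
    }
}
```

Because the built-in `TrimStart` overloads only accept `char` arguments, overload resolution picks this extension whenever you pass a string.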
The library is growing bit-by-bit, and contributions are welcome!
I use Dapper for most of my database interactions. I like it because it’s simple, and does exactly one thing: runs SQL queries, and returns the typed results.
I also like to deploy my schema changes as part of my application itself instead of doing it as a separate data deployment. On application startup, the scripts are loaded and executed in lexical order one by one, where each schema change is idempotent in isolation.
The problem you run into is making destructive changes to the schema, which is a reasonable thing to want to do. If script 003 creates a column of UNIQUEIDENTIFIER, and you want to convert that column to NVARCHAR in script 008, you have to go back and do some reconciliation between column types. Adding indexes into the mix makes it even hairier. Scripts that are idempotent in isolation are easy to write. Maintaining a series of scripts that can be safely applied in order from beginning to end every time an application starts up is not.
Unless you keep track of which schema alterations have already been applied, and only apply the changes that the application hasn’t seen before. Here’s a short, self-contained implementation:
```csharp
public class SchemaUpdater
{
    private readonly string _connectionString;
    private readonly ILog _logger;
    private readonly string _environment;

    public SchemaUpdater(string connectionString, string environment)
        : this(connectionString, environment, LogManager.GetLogger(typeof(SchemaUpdater))) { }

    internal SchemaUpdater(string connectionString, string environment, ILog logger)
    {
        _connectionString = connectionString;
        _environment = environment;
        _logger = logger;
    }

    public void UpdateSchema()
    {
        MaybeCreateAuditTable();
        var previousUpdates = GetPreviousSchemaUpdates();

        var assemblyPath = Uri.UnescapeDataString(
            new UriBuilder(typeof(SchemaUpdater).GetTypeInfo().Assembly.CodeBase).Path);
        var schemaDirectory = Path.Combine(Path.GetDirectoryName(assemblyPath), "schema-updates");

        var schemaUpdates = Directory.EnumerateFiles(schemaDirectory, "*.sql", SearchOption.TopDirectoryOnly)
            .Select(fn => new { FullPath = fn, Filename = Path.GetFileName(fn) })
            .Where(file => !previousUpdates.Contains(file.Filename))
            .OrderBy(file => file.Filename)
            .Select(file => new { file.Filename, Query = File.ReadAllText(file.FullPath) })
            .ToList();

        foreach (var update in schemaUpdates)
        {
            using (var connection = new SqlConnection(_connectionString))
            {
                try
                {
                    foreach (var statement in SplitOnGo(update.Query))
                    {
                        connection.Execute(statement);
                    }
                    connection.Execute(
                        "INSERT INTO SchemaRevision (Filename, FileContents) VALUES (@filename, @fileContent)",
                        new { filename = update.Filename, fileContent = update.Query });
                }
                catch (Exception e)
                {
                    _logger.Fatal(new
                    {
                        Message = "Unable to apply schema change",
                        update.Filename,
                        update.Query,
                        Environment = _environment
                    }, e);
                    throw;
                }
            }
        }
    }

    public static ICollection<string> SplitOnGo(string sqlScript)
    {
        // Split by "GO" statements
        var statements = Regex.Split(
            sqlScript,
            @"^[\t\r\n]*GO[\t\r\n]*\d*[\t\r\n]*(?:--.*)?$",
            RegexOptions.Multiline | RegexOptions.IgnorePatternWhitespace | RegexOptions.IgnoreCase);

        // Remove empties, trim, and return
        var materialized = statements
            .Where(x => !string.IsNullOrWhiteSpace(x))
            .Select(x => x.Trim(' ', '\r', '\n'))
            .ToList();
        return materialized;
    }

    internal void MaybeCreateAuditTable()
    {
        const string createAuditTable =
            @"IF NOT EXISTS(SELECT 1 FROM INFORMATION_SCHEMA.TABLES WHERE TABLE_NAME = 'SchemaRevision')
            BEGIN
                CREATE TABLE [dbo].[SchemaRevision]
                (
                    [SchemaRevisionNbr] BIGINT IDENTITY(1,1),
                    [Filename] VARCHAR(256),
                    [FileContents] VARCHAR(MAX),
                    CONSTRAINT PK_SchemaRevision PRIMARY KEY (SchemaRevisionNbr)
                )
            END";

        using (var connection = new SqlConnection(_connectionString))
        {
            connection.Execute(createAuditTable);
        }
    }

    internal HashSet<string> GetPreviousSchemaUpdates()
    {
        using (var connection = new SqlConnection(_connectionString))
        {
            var results = connection.Query<string>(@"SELECT Filename FROM SchemaRevision");
            return new HashSet<string>(results, StringComparer.OrdinalIgnoreCase);
        }
    }
}
```
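To see the GO handling in isolation, the same splitting logic can be exercised as a standalone helper (the sample script below is made up):

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Text.RegularExpressions;

public static class GoSplitter
{
    // Same idea as SplitOnGo() above: treat a line consisting of GO
    // (optionally followed by a repeat count or a -- comment) as a
    // batch separator, then drop empty batches and trim the rest.
    public static List<string> SplitBatches(string sqlScript)
    {
        var statements = Regex.Split(
            sqlScript,
            @"^[\t\r\n]*GO[\t\r\n]*\d*[\t\r\n]*(?:--.*)?$",
            RegexOptions.Multiline | RegexOptions.IgnoreCase);

        return statements
            .Where(x => !string.IsNullOrWhiteSpace(x))
            .Select(x => x.Trim(' ', '\r', '\n'))
            .ToList();
    }
}
```

Splitting `"CREATE TABLE A (Id INT)\nGO\nCREATE TABLE B (Id INT)\nGO"` yields two batches, which is what lets multi-batch scripts run through a driver like Dapper that executes one statement at a time.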
Update 2017-09-05: I added the SplitOnGo() method to support the GO delimiter, since I’ve had occasion to need it recently. It’s adapted from Matt Johnson’s answer on StackOverflow.
Update 2018-04-11: Most all of these changes have been published in ical.net versions 3 and 4. See the release notes for more details.
When I ported ical.net to .NET Core, I removed the ability to download remote payloads from a URI. I did this for many reasons:
- HttpClient leaves .NET 4.0 users out in the cold. Choosing to support WebClient brings those people into the fold, but leaves .NET Core and WinRT users out, and it prevents developers working with newer versions of .NET from benefiting from HttpClient.
- Async Tasks: given the popularity of microservices and ical.net's origins on the server side, anything that doesn't play well with async is a non-starter.
- We can't satisfy all use cases if we try to do everything, so instead I've decided that we'll leave over-the-wire tasks to the developers using ical.net.
To that end… strings will be the primary way to work with ical.net. A developer should be able to instantiate everything from a huge collection of calendars down to a single calendar component (a VEVENT, for example) by passing in a string that represents that thing. In modern C#, working directly with strings is more natural than passing Streams around, which is emblematic of old-school Java. Streams are also more error prone: I fixed several memory leaks during the .NET Core port due to undisposed Streams.
ToString() will be the serializer. It is reasonable for ToString() to serialize the typed representation into its textual representation. Constructors as deserializers buy us…
One of the challenges I faced when refactoring for performance was reasoning about mutable properties during serialization and deserialization. Today, deserialization makes extensive use of public, mutable properties. In fact, the documentation reflects this mutability:
```csharp
var now = DateTime.Now;
var later = now.AddHours(1);

var rrule = new RecurrencePattern(FrequencyType.Daily, 1) { Count = 5 };

var e = new Event
{
    DtStart = new CalDateTime(now),
    DtEnd = new CalDateTime(later),
    Duration = TimeSpan.FromHours(1),
    RecurrenceRules = new List<IRecurrencePattern> { rrule },
};

var calendar = new Calendar();
calendar.Events.Add(e);
```
To be completely honest, this state of affairs makes it quite difficult to make internal changes without breaking stuff. Many properties would naturally be getter-only, because they can be derived from simple internals, like Duration above. Yet they’re explicitly set during deserialization. This is an incredible vector for bugs and breaking changes. (Ask me how I know…)
If we close these doors and windows, it will increase our internal maneuverability.
Fluid API
Look at the code above. Couldn’t it be more elegant? Shouldn’t it be? I don’t yet have a fully-formed idea of what a more fluid API might look like. Suggestions welcome.
IICalendarTypeNames
The .NET framework guidelines recommend prefixing interface names with “I”. The calendar spec is called “iCalendar”, as in “internet calendar”, which is an unfortunate coincidence. Naming conventions like IICalendarCollection offend my sense of aesthetics, so I renamed some objects when I forked ical.net from dday. I’ve come around to valuing consistency over aesthetics, so I may go back to the double-I where it makes sense to do so.
CalDateTime
The object that represents "a DateTime with a time zone" is called a CalDateTime. I'm not wild about this; we already have the .NET DateTime struct, which has its own shortcomings that've been exhaustively documented elsewhere. A reasonable replacement for CalDateTime might be a DateTimeOffset with a string representation of an IANA, BCL, or Serialization time zone, with the time zone conversions delegated to NodaTime for computing recurrences. (In fact, NodaTime is already doing the heavy lifting behind the scenes for performance reasons, but the implementation isn't pretty because of CalDateTime's mutability. Were it immutable, it would have been a straightforward engine replacement.)
CalDateTime is the lynchpin for most of the ical.net library. Most of its public properties should be simple expression bodies. Saner serialization and deserialization will have to come first as outlined above.
VTIMEZONE
The iCalendar spec has ways of representing time change rules with VTIMEZONE. In the old days, dday.ical used this information to figure out Standard Time/Summer Time transitions. But as the spec itself notes:
> Note: The specification of a global time zone registry is not addressed by this document and is left for future study. However, implementers may find the Olson time zone database [TZ] a useful reference. It is an informal, public-domain collection of time zone information, which is currently being maintained by volunteer Internet participants, and is used in several operating systems. This database contains current and historical time zone information for a wide variety of locations around the globe; it provides a time zone identifier for every unique time zone rule set in actual use since 1970, with historical data going back to the introduction of standard time.
At this point in time, the IANA (née Olson) tz database is the best source of truth. Relying on clients to specify reasonable time zone and time change behavior is unrealistic. I hope the spec authors revisit the VTIMEZONE element, and instead have it specify a standard time zone string, preferably IANA.
To that end… ical.net will continue to preserve VTIMEZONE fields, but it will not use them for recurrence computations or understanding Summer/Winter time changes. It will continue to rely on NodaTime for that.
URL and ATTACH
As mentioned above, ical.net will no longer include functionality to download resources from URIs. It will continue to preserve these fields so clients can do what they wish with the information they contain. This isn’t a divergence from the spec, per se, which doesn’t state that clients should provide facilities to download resources.
A few months ago, I needed to do some calendar programming for work, and I came across the dday.ical library, like many developers before me. And like many developers, I discovered that dday.ical doesn’t have the best performance, particularly under heavy server loads.
I dug in, and started making changes to the source code, and that’s when I discovered that the licensing was ambiguous, and that it had been abandoned. I was concerned that I might be exposing my company to risk due to unclear copyright, and a non-standard license.
With some effort, I was able to track down Doug Day (dday), and he gave me permission to fork, rename (ical.net), and relicense his library (MIT), which I have done. So I’m happy to report…
mdavid, who saw to it that the library wasn’t lost to the dustbin of Internet history, has graciously redirected dday users to ical.net. Khalid Abuhakmeh, who published the dday nuget package that you might be using (you should switch ASAP) has also agreed to archive and redirect users to ical.net.
So… why should you use the new package?
Doug has revoked his copyright, and given unrestricted permission to give dday.ical new life as ical.net. That means ical.net is unencumbered by legal ambiguities.
My changes to ical.net have been mostly performance-focused. I was lucky in that dday.ical has always included a robust test suite with about 170 unit tests that exercise all the features of the library. Some were broken, or referenced non-existent ics files, so I nuked those right away, and concentrated on the set of tests that were working as a baseline for making safe changes.
The numbers:
There are no games here. ical.net really is that much faster.
Profiling showed a few hotspots, which I attacked first, but those only bought me maybe 3-4 seconds of improvement. There was no single thing that resulted in huge performance gains. Rather, it was many, many small changes that contributed, quite often by reducing garbage-collection pauses, many of which were 5 ms+, which is an eternity in computing time.
Here are a few themes that stand out in my memory:
- Converting old, non-generic collections (Hashtable, ArrayList) to modern, generic equivalents
- Changing List&lt;T&gt; to HashSet&lt;T&gt; for many collections, including creating stable, minimal GetHashCode() methods, though more attention is still needed in this area. A nice side effect of this was that a lot of lookups and collection operations then became set operations (ExceptWith(), UnionWith(), etc.)
- Converting O(n^2) methods to O(n) or better by restructuring methods based on information that was available in context
- Tuning List&lt;T&gt; and Dictionary&lt;TKey, TValue&gt; usage
- Hoisting deserializer lookups: there's no need to look up the deserializer (for Foo!) inside a tight loop that's only ever deserializing Foos, so you can make the call once and just reuse the deserializer
- Addressing TODOs in the comments

Along the way, I converted a lot of code to modern, idiomatic C#, which actually helped performance as much as any of the discrete things I did above. As I work towards a .NET Core port, I have the runtime down to about 2.8 seconds just through clarifying and restructuring existing code, and idiomatic simplifications.