In my last post I discussed the pervasive issue of relational modeling as a (poor) substitute for proper domain modeling, the reasons why (almost) everyone continues to build software that way, and the resulting problems that arise for those with ambitions to move their relationally-modeled software system to the cloud.
Given the prevalence of RDBMS-backed enterprise software and the accelerating pace of public cloud adoption, chances are pretty good you’re faced with this scenario today. Let’s discuss what you can do about it.
Scenario: We Built Our App Against A Relational Model… But Now We Want It In The Cloud!
My first piece of advice is make sure you really need public cloud. The cloud has well-documented scalability, elasticity, and agility benefits, but most of those arise only for software designed intentionally to take advantage of them. They also tend to prove most cost-effective when amortized across a relatively long application life span, or across a relatively large (and, ideally, increasing) number of transaction requests. If yours is a legacy app with modest and relatively static resource needs, the cost justification for a public cloud deployment may not be so obvious.
It may, in the final analysis, not exist at all. Sure, you can leverage IaaS capabilities to spin up VMs and perhaps buy yourself a modest additional level of deployment flexibility. For some, that might be a reasonable and justifiable decision; perhaps capital expense constraints preclude reinvesting in updated hardware for the private data center. But let’s face it, a “modest improvement” in deployment flexibility is hardly the stuff of IT legend.
Instead of diving blindly into the cloud and/or immediately devoting resources toward a cloud-focused refactoring effort, first consider your reasons for wanting to adopt public cloud. If your application performs poorly on existing private data center hardware, the cloud isn’t pixie dust to wave over it and make it better. In fact, for some legacy codebases cloud hosting could degrade performance even further! You must understand what you’re dealing with; find a trusted software architect to assess existing issues with your application, determine root causes, and recommend mitigation strategies. Know where your big resource-hogging transactions are. Understand which tables in your current relational database are hotspots, and what you can do to fix that. Understand what happens front-to-back-to-front as users click through your web UI and wait several seconds or more for page navigation, data refresh, etc. In many cases, additional (if non-trivial) refactoring might be enough to squeeze better performance from your existing infrastructure. If yours is a multi-tenant architecture, consider sharding your database layer by customer to better accommodate heavy load. This may also put you in a better position to eventually migrate to the cloud at a more appropriate time.
Perhaps your legacy app runs well enough for current user demands, but you anticipate additional capacity needs in the near future. This isn’t a bad reason to consider public cloud; but are you sure your app will scale well horizontally? Recall that scale-out (horizontal scaling) is far and away the preferred scalability mechanism in the cloud; scale-up (vertical scaling) is also possible but tends to be very expensive in the cloud and is only a good solution in relatively narrow cases. If scale-up is your preferred (or only?) strategy, it’s entirely possible you can achieve your goals more cheaply by migrating to bigger iron in your private data center than by paying hourly costs for super-sized cloud VMs. Maybe… maybe not. Again, you may need to find a trusted resource to help you do TCO analysis of cloud and private data center options. Be informed. Don’t make knee-jerk decisions, in either direction.
So far it sounds as though I’m recommending against legacy app migration to the cloud. Far from it… done correctly and for the right reasons it can easily justify the needed time and resource commitments. If your due diligence indicates that a move to the cloud makes sense for your legacy application, your next step is to determine whether the IaaS or PaaS hosting model makes sense for you (I’ll skip over the choice of public cloud provider, for now anyway).
We can keep this short: you’re almost certainly going to want to start with IaaS virtual machines. PaaS hosting is a great model for applications intentionally built to leverage the cloud and comfortable with the constraints imposed by the PaaS provider of choice (usually things like choice of technology stack, deployment options, etc). PaaS can have a superior TCO story relative to the IaaS hosting model. But as you’re starting with a legacy application that wasn’t built for the cloud, PaaS may not be a deploy-and-forget scenario for you. To leverage PaaS you’ll need a horizontally scalable application that minimizes dependencies on custom OS configuration and file system access (custom Windows registry keys and reliance on specific or deep file system hierarchies are all red flags). You’ll need to understand the configuration, monitoring, and authentication/authorization subsystems of your PaaS provider and ensure your software plays nicely with those. You’ll also need to enable access from your PaaS-hosted app to your database; many PaaS solutions provide such behavior but only for a limited subset of technologies and configurations. Non-standard setups either require manual configuration or might simply be disallowed. In summary, a PaaS-hosted application is a cloud-ready application; getting your legacy app to that state is going to involve some effort. Thus IaaS will usually be a better starting point for existing apps.
Of course, VM hosting in the cloud can help you more easily scale up your existing relational database to the limits of your data size and query patterns (and budget ) and help you correspondingly scale up/out your application tier. To the extent this gives you more flexibility than you would have otherwise had in your private data center, this is a good thing. But if your resource needs continue to grow, you’ll eventually end up needing to refactor your database and your code to scale horizontally. Where to start?
Here’s where proper domain modeling is needed. There are a number of techniques for this; my personal favorite is Domain-Driven Design. Set aside your existing software for the moment and build a conceptual model of the business problem its solving... something a user of the system would understand. Know that this is no easy task; prepare to spend time and money to do it right. But the overarching goal is two-fold. First, develop this conceptual picture of your business problem independent of any storage-centric model used to represent data. This will give you technology-agnostic building blocks through which you can sort through challenging aspects of the system. Second, subdivide that large conceptual model into smaller, easier-to-digest sub-models that describe discrete processes within the larger problem domain. In DDD these sub-models are known as bounded contexts; they provide ambient, point-in-time meaning for actions and state changes within the system (you’re not just adding line items to a purchase order, you’re building the user’s shopping cart and preparing it for processing) and they’re bounded by a primary reason for existence (sort of a modeler’s take on the Single Responsibility Principle). A single conceptual model will consist of many such bounded contexts.
Bounded contexts become the unit of transactional consistency within a larger system; that is, state changes within a bounded context should be encapsulated within a single database transaction. Thus the use of bounded contexts tends to largely eliminate a big problem of monolithic relationally-modeled applications: the tendency for state changes to cascade across large swaths of the relational model, which results in very large (and slow!) transactions. Of course, the need will arise to propagate changes from one context to another; in DDD this often occurs as a formally modeled state change event (“item X added to shopping cart”) which can be handled by any other context which previously registered interest in such an occurrence. The handling of these events occurs beyond the scope of the initiating transaction, however. For more information on how this works, read up on eventual consistency (or drop me a line!).
The use of bounded contexts has an interesting side effect for cloud-based systems. By subdividing your application model into smaller, self-consistent sub-models with small transactions, you make it possible to deploy storage independently for each context. In other words, you get horizontal scale of your database as a fundamental part of your design, which is the fundamental building block of a cloud-native application.
In practice, multiple bounded contexts will store their state on the same machine (even the same database instance); but there’s nothing stopping you from using multiple database machines or even multiple database products across multiple machines, if the need arises from a scalability perspective. In fact, that’s precisely how cloud-hosted data stores scale to handle request levels far beyond what was considered “big” even five years ago. In the cloud, you don’t need (or even want) One Big Database.
So back to your existing application. Once you’ve done some domain modeling and subdivided that into manageable sub-models, consider the work you did to identify performance bottlenecks and database hotspots, and map that list of issues to the sub-models you created. Granted the lines won’t always be neat and clean, but the goal is to prioritize sub-models that you can migrate from the existing relational model to their own self-encapsulated ones. Start with the highest priority items and work your way down the list. With time and patience you can convert that legacy, monolithic enterprise application into a horizontally-scalable, cloud-ready fire breather. Just realize that there is no silver bullet here; application migration to the cloud is a process that rewards steady persistence above all else.
A minor disclaimer: clearly I’m glossing over the details of this process, and focusing principally on the database layer. This is intentional, but it does omit other interesting and relevant topics like security, integration, deployment mechanics, legal and economic considerations of the cloud, etc. I haven’t even scratched the surface of all DDD has to offer. Those are fun discussions too . Ping me if you’d like to talk specifics of your situation.
Next time, I’ll talk about ways to avoid this whole mess if you’re writing a new cloud-targeted enterprise app… we have the technology!
For CTOs looking to squeeze new life out of legacy enterprise applications, the cloud offers tantalizing prospects. Pushbutton scalability, reduced capital costs, outsourcing of non-core IT functions, potentially greater monitoring and health management capabilities and almost certainly greater overall uptime… even with the potential downsides, its no wonder senior management is tempted.
And yet those downsides are more than nagging problems; in many cases they pose significant barriers to a successful move from private data center to public cloud. An existing enterprise app might work fine running on internal hardware, with a modest user base… but move it blindly to a VM in Azure or AWS and suddenly that clever little accounting app grinds to a halt (as does your business). But why is that, exactly?
Where’s The Rub?
There are many potential difficulties to overcome when migrating an existing enterprise app to the cloud: reliance on past-generation (read: potentially unsafe) database or file-system drivers that might be ill-suited (or incompatible with) your chosen cloud stack, legal or regulatory requirements that mandate where the data lives, preconceptions about ambient hardware or network infrastructure baked (inadvertently, or otherwise) into your software, security contexts or sandbox privileges required for successful operation that may not be recommended best practices in the cloud, etc. Any of these (and many more) can trip up a migration effort. But there’s one incompatibility so pervasive that its worth discussing further, on its own. Its origins largely pre-date cloud computing itself, in fact. But the negative effects haunt us now, and we’ll likely continue to deal with them for years to come.
It’s your application’s data model.
It’s not the data itself… even if you’ve got a lot of that, there’s plenty of room to store it all in the cloud, if you want. And it’s not the application code per se, though it’s likely that you’ll need to change at least some of that to maximize the full potential of your cloud-hosted application.
No, what I’m talking about is the original conceptual model used to define the database underpinning your application. This model was probably created a long time ago… perhaps you paid a lot of money to a database architect who studied your requirements and used tools like Erwin or ER/Studio to make complex graphical depictions of tables and relationships, or maybe the model was defined by developers in code using APIs like Entity Framework or NHibernate. In either case, you could likely sit down with a developer on your team and have them walk you through the model, and you’d see elements of your business domain that you recognize… a Customer table, defined relationships between an Address and a Warehouse, etc. And this would seem logical and reasonable to you… the application performs some vital function for your business, as part of that it manipulates data, that data needs to live somewhere… voila! Here it is… in the database, created from this model.
A Minor Assumption, With Major Implications
The problem is that this model almost certainly has one very big assumption baked into it… it assumes there will be one physical database created from the model, and that all the data will live there. It is by definition a relational model… the concepts modeled within and their relationships (their “referential integrity”) can only be reliably maintained if the data is reasonably co-located, such that the database process can enforce transaction boundaries, data staleness and visibility rules, update query indexes as data changes render them obsolete, process complex joins across multiple tables, etc. Relational databases do not (cannot) reliably do these things across multiple machines. For a more detailed, nerdy explanation of why this is so, see CAP theorem.
In short, a relational model is predicated on the existence of One Giant Database. And unfortunately, sooner or later that single machine is doing as much work as it can do, but you need more. And now you have One Giant Problem.
To be clear, this isn’t really a cloud-specific issue. Any computational- or data-intensive resource (cloud-based or not) will eventually saturate. At that point, you have two options: scale up (buy a bigger server) or scale out (buy more servers). If the resource in question is an application server, either option (assuming your application architect is competent and anticipated scale-out scenarios) can work. But if the resource is a traditional relational database, you really only have two options: scale up and hope it’s good enough, or re-architect for scale-out. Sharding is sometimes a possible third option, sometimes a manifestation of the second… it has it’s place, but also enough drawbacks to make it unsuitable for the general case.
Scale up… or out… or ?
So for relational database scalability issues, scale up is usually the first consideration. More memory, more processors, more and faster disks… these will help your application serve more requests and handle more users, for a time. But you eventually bump up against the laws of physics. There is only so much RAM, CPU, and disk I/O you can bake into a database machine (physical or virtual, cloud-based or not). And even if your data access needs are within reach of the current technological state of the art, they may not be within reach of your budget (a quick perusal of AWS hosted database pricing shows a range from less than 2 cents/hour to over $7.50/hour… cha-ching!).
And so you’re left with the option of re-architecting for scale out. Scale out has two significant advantage over scale up. First, it’s theoretically unbounded; you can keep adding more servers forever. In practice this isn’t even remotely true, as life and software architecture inevitably intrudes and poses some actual upper bound. But still, it’s reasonable to say that a properly architected enterprise application can scale out much, much further than it can ever scale up. The second advantage is cost; scaled out solutions can be done incrementally, and with commodity hardware. This affords you the opportunity to purchase as much scalability as you need at any given moment in time. Scaled up solutions require the use of ever-more-expensive hardware resources, and perhaps worse, necessitate that existing resources must be retired or repurposed… with scale up, you can’t aggregate hardware to work cooperatively (which is exactly what happens in scale out).
But the big disadvantage of scale out is that you have to plan for it, architect for it, and choose technologies that enable it. And there’s the core issue with relational models and scale out; a relational model, and a database created from it, and likely the code written to work with it, are all fundamentally incompatible with any plan to scale out arbitrarily (darn that referential integrity!). Something will have to change, and that something will cost you time and money. There are limited options in products like SQL Server and Oracle for clustering a handful of machines together, but these tend to be used more in service to failover/reliability than pure scalability/availability needs.
A Storage Model Is Not a Domain Model
So, fine then… relational databases are incompatible with the preferred means of scaling cloud-based software (meaning, scale out). Relational models are poor but frequently used tools for modeling business domains, with significant negative implications for future scalability of the affected applications. But how did this happen? Didn’t we see this coming?
Sure we did. For years, smart people have implored us to stop using (relational) database models as the blueprint for software implementing non-trivial business processes. We just didn’t listen. Our tools (cough Entity Framework cough) make it easy to go from database to code, and while things like EF code first provide us with other modeling alternatives, many applications are still constructed bottom-up from a relational database model. Guess what? If you start with a monolithic relational model and auto-generate EF code to talk to that model, your EF code isn’t any more cloud-ready than your database is (to be clear, I like EF and think it’s entirely appropriate for use on constrained subsets of an otherwise large model, even in the cloud… it’s the naïve use of huge, monolithic EF models that I object to).
“But we’ve always done it this way.” Sure we have. In fairness, that’s not entirely our fault… the skills and tools needed to create a proper domain model independent of a dedicated storage model have for various reasons not yet gained broad traction. The path of least (initial) resistance is to start with a database and build upwards. Legions of enterprise developers have written code like this for years (decades?) and still do. I like to think we’re slowly moving beyond this, and I have high hopes for things like Domain-Driven Design, microservices architectures, and polyglot persistence as some of the practices and patterns that will help us break the cycle. More on that in my next post. But for now, we’re still a long way from industry-wide enlightenment.
Your Technical Debt Is Now Past Due
We’ve kicked this relational modeling can down the road for a long time, because we could. In a world of small private data centers, modest departmental application needs and manual, Excel-driven business processes, relational databases with relational CRUD-style code on top and built from relational models are not always great but are often good enough. It’s when our ambitions grow, and our anticipated use of these creaky enterprise apps grows along with them, that our best laid plans face the harsh reality of the technical debt we’ve incurred.
You want to move your IT infrastructure to the cloud? You want a more elastic, robust, flexible, agile infrastructure upon which to run your business? That’s a valuable goal. The cloud can give you that, and more. But make plans to retire that technical debt first.
In my next post, I’ll explore ways to do just that… we’ll talk about migration strategies for existing applications, and also touch on ways to minimize that technical debt in the first place.
I’ve been working off and on for the last few months on a Windows app that I intend to publish soon in the Windows Store. Called “Pic Me,” the app sprang from a question my daughter asked me one night: “Dad, can you write an app that makes it easy to see all the photos I’ve been tagged in on Facebook and also lets me download those photos?” It sounded like a terrific idea, so I started laying down some code. I decided to make it a universal Windows app so it would run equally well on desktops, tablets, and phones. V1 is almost ready to submit for certification, so I thought I’d share it, source code and all, in case you’re interested in seeing how it’s put together and are interested in authoring universal apps of your own.
The screen shot below shows how Pic Me looks on a tablet and on a phone after I logged in using my Facebook credentials. (Try it yourself to see which photos you’ve been tagged in!) The login is accomplished using WinRT’s awesome WebAuthenticationBroker class, and if, after logging in, you want to log in as someone else to see what photos they’re tagged in, simply use the Switch User command in the command bar. On a phone, tap the ellipsis (the three dots) in the command bar to show the Switch User command, and on Windows, drag the command bar up from the bottom of the screen or display it by right-clicking on the screen or pressing Windows-Z on the keyboard.
Both versions of the app – the Windows version and the Windows Phone version – use a grouped GridView control to show the photos you’re tagged in and the year in which the photos were posted. Among other things, this highlights the awesomeness of the fact that GridView is now supported in Windows and Windows Phone. I templated the controls slightly differently to optimize for the form factor, but it’s same basic code and XAML working in both instances. If you tap a photo on Windows, an enlarged version of the photo appears in an overlay; do the same on the phone, and you navigate to a page that shows a detail of the photo, complete with information about who posted it and when. If the photo is accompanied by a caption, a downward-pointing arrow will appear in the header, and you can tap it to view the caption.
To save a photo to the local device, tap the Save button (the one with the disk icon) in the command bar. But you’re not limited to downloading one photo at a time; if you’d prefer, you can download all (or several) of them at once. Just go back to the main screen and select the photos you wish to download (see below). On a desktop or tablet, use a right-click or a vertical swiping motion to select a photo; on a phone, tap the Select button in the command bar to enter selection mode, and then tap each photo you wish to download. (On both platforms, you can use the Select All and Clear All buttons in the command bar to select or deselect photos en masse.) Once all the photos you want to download are selected, tap the Save button and all will be downloaded to the destination folder of your choice.
Pic Me employs several interesting techniques and best practices for developers interested in learning to write universal apps. For one, it uses a novel code-sharing technique based on partial classes. The Windows project and the Windows Phone project each have a MainPage.xaml file and a MainPage.xaml.cs. But the shared project has a MainPage.xaml.cs, too. The shared MainPage.xaml.cs file contains code that is common to both apps, while the others contain code that is app-specific. I could have put everything in the shared MainPage.xaml.cs and #iffed the heck out of it to separate platform-specific code from shared code, but that seemed like a bad idea from a maintainability standpoint. So I leveraged C#’s support for partial classes and achieved a much cleaner code separation.
Another point of interest for developers is the app’s use of WebAuthenticationBroker, SaveFilePicker, and FolderPicker. These classes are implemented in WinRT in Windows and on the phone, but they work very differently on the two platforms. Specifically, brokers and pickers on the phone rely on a continuation pattern that means the app is deactivated (and possibly terminated) while a broker or picker is displayed. Jeffrey Richter blogged about this a while back and offered some cool helper classes to abstract the differences. Among other things, the continuation pattern means you’d better be serious about writing code to save and restore the state of the app in the event of suspension and termination, because if your app is indeed terminated while a broker or picker is displayed, the user is going to be pretty unhappy when the app is reactivated.
You can download the source code for Pic Me from my OneDrive. Be aware that the source code will probably change some as I do further testing and make last-minute tweaks, and I still need to come up with some unique imagery to brand the app. (When the final code is ready, I’ll refresh the download so you’ll have the latest and greatest bits.) Meanwhile, if you’re a developer, let me know if you find any bugs. I’ve spent quite a bit of time testing the edge cases (e.g., your Internet connection failing at just the wrong time, or an OAuth token expiring in the middle of a series of calls to Facebook’s cloud APIs) and working around quirks in the API (why in the world does WebAuthenticationBroker on Windows throw a FileNotFound exception if you fire it up without an Internet connection and click the Back button to back out of it?), but it’s always possible I’ve missed something, and I’d appreciate hearing about it if I did. I’m interested in usability feedback, too, although I should caution you that I’ve done extensive usability testing with two experts: my daughters!
I am a huge believer in ongoing education. In fact, I regularly enroll and complete university MOOC courses that have nothing to do with software engineering (currently enrolled in 2 active courses and just recently completed another 2). I typically enroll in the edX courses (nearly all of which can be audited for free): https://www.edx.org/.
But I also really believe in (proactive) ongoing education throughout one’s career. Sure, it’s nice to have an employer who is willing to foot the bill for regular classroom training (we offer that too, by the way), but even without access to a classroom we all still have ample access to a wide range of career-centric education options – including the large (and growing) catalog of developer-focused online training material available through the WintellectNOW platform.
In the past I’ve mostly watched the occasional training videos in my spare time, but now I’d like to try something a little different and more deliberate. My plan is to pick two videos every week or so from our growing catalog, watch them over the weekend, and then post a quick review here of the highlights of those videos. If you would like to follow along but don’t have a WintellectNOW subscription then you can activate a trial using the code ROME-13 during registration.
For this first week I will be starting with:
I’ve not worked with either of those technologies before, so this should be a good use of a couple hours of my time this weekend!
BTW, if you are curious, the MOOC courses I am currently taking are:
- MITx: 6.002x Circuits and Electronics https://courses.edx.org/courses/MITx/6.002_4x/3T2014/info
- UTAustinX: UT.3.02x Age of Globalization https://courses.edx.org/courses/UTAustinX/UT.3.02x/3T2014/info
These two were completed recently:
- ANUx: ANU-ASTRO2x Exoplanets (really loved this one!!) https://courses.edx.org/courses/ANUx/ANU-ASTRO2x/2T2014/info
- CornellX: Astro2290x Relativity and Astrophysics (this one was extremely difficult but worth it) https://courses.edx.org/courses/CornellX/Astro2290x/1T2014/info
I have been working with Azure Storage Tables for many years now. During this time, I have learned many good practices and have also experienced many bad practices. So, over the past few months, I decided to write a document so I can share my experience with others. I call this document Jeffrey Richter's Guide to Working with Azure Storage Tables via C# and other .NET Languages. You can download my guide from the Wintellect website and you can learn more about Azure Tables by watching my video on the WintellectNOW website. My Guide has been reviewed and is endorsed by Microsoft’s own Azure Storage team. The guide has several purposes:
- Help developers improve their mental model with respect to Azure Tables
- Discuss the good and bad parts of Microsoft's .NET Azure Storage client library
- Show good patterns and practices related to Azure Tables Introduce my own (free) .NET Azure Storage client library which increases programmer productivity. The library offers many features to assist developers working with Azure Storage. For example, it offers blob logging features and a periodic elector which uses blob leases to elect a single VM to perform a periodic task (like backup storage data or to produce a weekly report). For tables, it offers many features including backup/restore, optimistic concurrency, easy filter construction, simple segmented result processing, property replacer/changer, pattern for extensible entity schemas, collection to/from property serializer.
I hope users of Azure Storage find my Guide and its accompanying class library useful.
I just got home from Devlink (unfortunately I had to bail out a day early) but I wanted to take a moment to say how impressed I am with the event, facilities, staff, and most important… the content! There were some excellent sessions throughout the week and my only regret in giving two talks of my own is that it left less time to soak up knowledge from everyone else. This was my first Devlink… it definitely won’t be my last. Kudos and sincere thanks to John Kellar and the Devlink board for putting on a great conference.
I had the pleasure of delivering two talks… “Node.js for .NET Developers” and “AWS vs. Microsoft Azure”. Both had great audience engagement and were lots of fun to deliver. I also did tag-team delivery of the all-day Microsoft Azure Pre-Con session with fellow Wintellectual John Garland, himself a fountain of Azure knowledge and all-around smart dude. It’s almost enough for me to forgive the fact that he’s a Florida Gator. Almost.
If you’re interested in the slide deck for my Node.js for .NET Developers talk, it can be found here. Likewise, the deck for my AWS vs. Azure talk is here. If you enjoy reading through them or have questions/comments/feedback, drop me a line at [email protected]. Always happy to talk Node, cloud, and other fun stuff.
I did an online session today celebrating the 25th anniversary of our Partner Infragistics. During the session, there were a lot of questions I was unable to answer because we ran out of time. Below are those questions and my responses (in italics). Many of these questions are answered by my various WintellectNOW videos. You can register for a free 14-day trial here.
1. Are memory leaks reflected in the used memory statistic of the task manager?
No, Task Manager doesn’t offer the best column for this. Use PerfMon.exe and watch a Process’ Virtual Bytes.
2. As a Windows OS advocate, I am curious on how Threading is implemented in Unix/Linux.
I answered this question on the call.
3. Can a thread ever run (or be set to run) for longer than one quantum?
Can you increase the time quantum for long running tasks
Not directly. You can raise a thread’s priority so it prevents lower-priority threads from running.
4. Can you point out the difference between threads and tasks?
A Task queues an operation to the thread pool. The thread pool then has one of its threads perform the operation. The thread pool threads are re-used over and over again to process all the queued operations. This reduces memory consumption and improves performance.
5. Could you elaborate more on the logical vs. real CPUs?
A physical CPU can perform one operation at a time. But sometimes, the physical CPU must pause and wait for RAM to complete some work. This causes the CPU to sit idle. Hyper-threaded CPUs can execute another thread during these pause times to improve overall system performance.
6. File Open Dialog was a common windows UI Control which exhibits this leak. Are there Windows APIs that tend to "leak" as well?
When you take a dependency on any technology (Microsoft or non-MS), you inherit its performance and efficiency problems. But, you saved yourself some time and energy. As a software developer you are charged with considering this tradeoff and determining if it is worth it for your application and your customers. Also, note that performance and efficiency are moving targets; that is, they change over time with later versions of the technology. With later versions things can get better or worse. So, when you take a dependency on some technology, these are the things you must be thinking about.
7. Given the advantages of using the thread-pool wouldn't it make sense to not allow the developer to create explicit threads and only provide access to "tasks" via the Win API?
Yes, in fact, the Windows Runtime (WinRT) API does not offer any functions allowing you to create threads; you MUST use the thread pool.
8. How are threads from background processes scheduled?
I’m not sure how you exactly define a “background process.” But, for the most part Windows schedules threads in a round-robin fashion without regard to which process the threads are in.
9. How could I make OS to schedule the most of CPU time working for an application?
This is dangerous thing to do and is discouraged. If the app goes into an infinite loop, then the rest of the system suffers greatly. However, you can raise the priority of threads in a process.
10. How does a thread pool help and how many threads should a pool have?
Thread pools help because they create threads and re-use them over and over again. This saves time because they do not constantly create and destroy threads. In addition, the thread pool knows how many CPUs the PC has and tries to create 1 thread per core to reduce context switching; this also improves performance.
11. How is the new Task class implemented? I had heard it was lighter weight than the Thread.
A Task is a small object in memory that knows how to queue a callback method to the thread pool. A task has no threads of its own. The task can monitor the lifetime of the queued item: did it complete? Did it thrown, did it return a value, etc.
12. How to interpret the benefit of parallel computing (increased performance) related to this thread waste of resource?
Most PCs can easily handle the allocation of a few MBs of memory in order for you app to take advantage of parallel processing. If you just strive for no more than 1 CPU per logical core, then memory consumption will stay low and performance will stay high.
13. If a quantum is 30 ms - how long does the context switch take?
How much time does it take the OS to do a context switch (i.e. what is context switch overhead relative to the size of a quantum)?
The time for a context switch various based on many factors: CPU speed, CPU architecture, and so on. But, what makes the performance even worse is that the CPUs cache is usually invalid after a context switch causing a lot of cache misses when accessing RAM.
14. If it is beneficial to create less thread as possible then what strategy should we choose when creating a WPF responsive Application? Any pattern or approach?
Typically UI apps (like WPF) have 1 GUI thread that process all user-interface events. Then, you queue up computationally-intensive work to the thread pool (via a Task) allowing the UI thread to respond to user input.
15. Jeff, even today we see some situations where we are not able to bring up task manager when the system is extremely busy.. why is this happening
It’s hard for to know for sure without being in front of the machine. But, my guess is that there may be some high-priority threads that are preventing Task Manager from displaying. This can sometimes happen with a bad device driver too.
16. Question set about duration of quantum, which you've indicated is 30ms - Has it always been 30 ms? What governs this duration? The HAL/clock interval?
Yes, the PC’s clock interval. There is a Win32 function that returns this info: GetSystemTimeAdjustment. Look at the lpTimeIncrement return value.
17. So are threads automatically recycled after a certain period of not being used?
If the thread are thread pol threads, then yes. If they are not thread pol threads, then no.
18. Threading topics apply to Azure programming?
Yes. Azure just creates virtual machines with Windows or other operating systems running in them. So information about threads apply to these VMs as well.
19. ThreadPool.GetAvailableThreads() show the answer as 1024 instead of 8. How is that possible when the number of cores in my PC are just 8?
In the remarks section for this method, it says that it returns the “number of additional worker threads that can be started.” This says that the thread can create this menay threads but not that it has actually created them.
20. What happens if you open another File Save As? Will more threads be orphaned?
No, it reuses the threads it created previously.
21. What is difference between Foreground and Background thread.
The .NET CLR kills a process as soon as all its foreground threads have stopped running (instantly destroying any background threads). So, foreground threads keep a process running while background threads do not.
22. What is fiber support in case of threading
Fibers a light-weight threads with the OS kernel doesn’t know anything about. The developer must write code to “context switch” from one fiber to another. Each fiber does have its own user mode stack but all fibers (on the same thread) share a kernel mode stack. .NET doesn’t support fibers and probably never will.
23. What is the performance effect of having multiple CPUs with multiple cores vs a single CPU with an equivalent number of cores?
For the most part, the perf would be the same. Sometimes, CPUs have to communicate with each other (like when taking a thread synchronization lock) and this communication is faster if the CPUs don’t have to talk through the bus on the motherboard.
24. How does async/await relate to threading?
These C# and VB language features allow you to perform I/O operations without blocking threads. This reduces the number of threads an application/service needs decreasing resource consumption and improving performance. I explain the value of all this in the video available here.
25. Is it always 30ms? Even in Windows 8.1?
On all versions of Windows to date, the timer interrupt fires at the same rate.
26. How do we get the additional threads started by File Open Dialogs (and other such controls) returned back to Thread Pool ?
You can’t control what a component does.
27. So those logical processors have their own CPU cache too? Otherwise, won't that impact performance?
The logical processors typically share the CPU’s cache. This is usually good as it allows bytes to be read once into the cache and shared by the other logical processors.
28. Is it wasteful to switch contexts if all the threads on the system are waiting most of the time?
No. But, if a thread does not want to wait, then allowing it to run without context switching is faster than introducing context switching.
29. Why is 32-bit arm about 1/2 of 32-bit x86 for overhead for memory for thread kernel object? Less data? More packing?
ARM CPUs have fewer CPU registers.
30. My question was: will there ever be such a concept supported by the .NET runtime, like the BEAM (Erlang virtual machine) threads, which said to be far more efficient and less resource expensive than threads, quicker context switch.
I’m not familiar with BEAM threads and what this actually means. I assume that they are like fibers and it is very unlikely that .NET will ever support fibers.
For the past several months, Wintellect has been re-architecting the http://WintellectNOW.com website. We built a new storage model that separates users from accounts. This allows us to accommodate our corporate and enterprise customers better. For example, with the new system, an account can have multiple administrators allowing multiple people within a company to control who can and can’t watch videos. We’ve also been working on a much improved user experience so watching and learning from all our videos is much cleaner and faster. We haven’t launched these improvements publicly yet; but hope to in the next few months.
However, on August 23rd 2014, we did launch the new site internally for testing. Since we require SSL across the whole site, we needed a certificate and since the site will ultimately be deployed at http://WintellectNOW.com, we decided we’d use that certificate. WintellectNOW runs as a Microsoft Azure Cloud Service and so we went to our certificate authority’s website and re-keyed our WintellectNOW SSL certificate and uploaded it to our testing site: so far, so good.
A few days later, some customers contacted us telling us that the public WintellectNOW website was unreachable. Their browsers were reporting to us that there was something wrong with our certificate. Once we got the reports, we tried accessing the site from several devices. All mobile phones could access the site perfectly while browsers running on desktop computers reported certificate errors. Some of those browsers, such as Internet Explorer, refused access to the site, but others, like Safari, let us continue to the site after reporting the certificate error. Even more interesting is that some browsers just said there was a problem with the certificate. The failure inconsistencies and the fact that we didn’t touch the public-facing site made the whole thing quite puzzling.
It took us about 30 minutes (with some more experimentation) to reason it out. Our certificate authority revokes a certificate when you re-key it. Apparently, not all certificate authorities do this but ours does. And, not all browsers honor this, but some do. I would have expected a “NOTE: Re-keying a certificate revokes your current certificate.” in big bold red letters to be on the CA’s webpage where you do the re-keying but no such text is there. Now that we knew what was wrong, it was really easy to fix it: we uploaded the new certificate to the public WintellectNOW website and then uploaded a new Azure Service Configuration (.cscfg) file to our role instances and the site was back again working for everyone within 5 minutes.
While this was a fire-drill that had several Wintellectuals working on the problem, it was a great lesson to learn (which is why I wanted to share it with this blog post). And, it was also reassuring to see how well we handled an unexpected problem as a team and how quickly we were able to resolve the issues for the customers who were experiencing problems.
We are very excited about the WintellectNOW website as is stands today and about our plans for its future. If you’d like to explore our deep, rich technical contact for two weeks free, register using this link: https://www.wintellectnow.com/Account/Promo/JeffreyR-2013.
On a side note: We license and optionally manage the WintellectNOW website to other companies who use it as their own video delivery platform. The architectural changes we’ve been making to the site helps these companies too. Please contact Wintellect if you’d like to use the WintellectNOW website for you own company’s video delivery system.
Attendee registration is now open for the 2014 Atlanta Code Camp: https://atlcc2014.eventbrite.com
Code Camp is your opportunity to join other like-minded developers from Atlanta and the greater southeast region for a full day of training on a range of modern technologies. It is an annual volunteer-driven community event, and attracts a wide range of experts on many topics including Windows, .NET, web development, Azure, mobile, design, and many others. The Atlanta Code Camp is always on a Saturday, so you don’t need to take a day off from work to attend.
This year’s event is being held on Saturday, October 11th, at Southern Polytechnic State University in Marietta GA. We are collecting a small fee ($10), which mostly covers the cost of lunch (which is provided).
Register today, because space is limited and tickets are available on a first-come, first-served basis! Once we sell out we can’t let anyone else through the door (due to safety rules at our venue)!
Interested in presenting one or more talks at this year’s event?
We are still accepting submissions. Please submit a topic for consideration using the form found here: http://www.atlantacodecamp.com/2014/CallForSpeakers
Are you (or your company) interested in becoming a sponsor for Code Camp?
We are still seeking additional sponsors! Sponsorship is a great way to get your company name in front of hundreds of motivated software engineers. If you or your company would like to sponsor this year’s event, then please visit our sponsorship page here: http://www.atlantacodecamp.com/2014/Sponsoring
For more information, please email the Code Camp team at [email protected]
Atlanta Code Camp main website: http://www.atlantacodecamp.com/2014/