Troubleshooting Windows Azure Applications

Typically, on Windows Server applications, troubleshooting is done by turning on IIS logs and event logs. These logs survive restarts, and developers examine them when a problem occurs. The same process can be followed in a Windows Azure application if remote desktop is enabled: developers can connect to each instance, collect the diagnostic data, and simply copy it to a local machine. However, this process is time consuming and will fail if the instance is reimaged. It also becomes quite impractical when dealing with many instances.

Windows Azure Diagnostics

Windows Azure Diagnostics (WAD) provides the functionality to collect diagnostic data from an application running on Windows Azure and store it in Windows Azure Storage.

Setup Collection

The easiest way to set up WAD is to import the Windows Azure Diagnostics module into the application's service definition and then configure the data sources for which diagnostic data is to be collected.

<ServiceDefinition name="TroubleShootingSample" xmlns="http://schemas.microsoft.com/ServiceHosting/2008/10/ServiceDefinition" schemaVersion="2012-05.1.7">
  <WorkerRole name="WorkerRole1" vmsize="Small">
    <Imports>
      <Import moduleName="Diagnostics" />
    </Imports>
  </WorkerRole>
</ServiceDefinition>

A role instance configured with the diagnostics module automatically starts the diagnostics monitor, which is responsible for collecting diagnostic data. The following table lists the types of diagnostic data that you can configure your application to collect. Only some of the data sources are added to the diagnostics monitor by default; the rest must be added explicitly.

Data Source | Description | Collected by Default | Associated role type
Windows Azure Logs | Trace messages sent to the trace listener 'DiagnosticMonitorTraceListener', which gets added by default to web.config or app.config | Yes | Web and Worker
IIS 7.0 Logs | Information about IIS sites | Yes | Web
WAD Infrastructure Logs | Logs pertaining to the diagnostics infrastructure, RemoteAccess and RemoteForwarder modules | Yes | Web and Worker
Failed Request Logs | Information about failed requests to an IIS site or application | No | Web
Windows Event Logs | Events that are typically used for troubleshooting application and driver software | No | Web and Worker
Performance Counters | Performance counter metrics | No | Web and Worker
Crash Dumps | Mini/full crash dumps of the application | No | Web and Worker
Custom Error Logs | Custom data logged to a local storage, which then gets transferred to Windows Azure Storage | No | Web and Worker

Collect IIS failed request logs

Enabling collection of IIS failed request logs can be done by adding the following to the 'web.config' of the associated web role, under the 'system.webServer' section:

<tracing>
  <traceFailedRequests>
    <add path="*">
      <traceAreas>
        <add provider="ASP" verbosity="Verbose" />
        <add provider="ASPNET" areas="Infrastructure,Module,Page,AppServices" verbosity="Verbose" />
        <add provider="ISAPI Extension" verbosity="Verbose" />
        <add provider="WWW Server" areas="Authentication,Security,Filter,StaticFile,CGI,Compression,Cache,RequestNotifications,Module" verbosity="Verbose" />
      </traceAreas>
      <failureDefinitions statusCodes="400-599" />
    </add>
  </traceFailedRequests>
</tracing>
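As a quick illustration of the first data source in the table: anything written through the standard .NET 'System.Diagnostics.Trace' API in role code is picked up by the default listener and surfaces as 'Windows Azure Logs'. This is a minimal sketch; the class and messages are illustrative.

```csharp
using System.Diagnostics;

public class Worker
{
    public void DoWork()
    {
        // Collected by the DiagnosticMonitorTraceListener (added by default
        // to web.config/app.config) and stored as 'Windows Azure Logs'.
        Trace.TraceInformation("Worker iteration started");
        Trace.TraceError("Something went wrong: {0}", "details");
    }
}
```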

Collect Windows Event Logs

Windows event log collection has to be enabled imperatively via code. This is done by calling the 'GetDefaultInitialConfiguration' method of 'DiagnosticMonitor', adding the 'WindowsEventLog' data source, and then calling the 'Start' method of 'DiagnosticMonitor' with the changed configuration. This is typically done within the 'OnStart' method of the role.

DiagnosticMonitorConfiguration config = DiagnosticMonitor.GetDefaultInitialConfiguration();
config.WindowsEventLog.DataSources.Add("System!*");
DiagnosticMonitor.Start("Microsoft.WindowsAzure.Plugins.Diagnostics.ConnectionString", config);

Collect crash dumps

Crash dump collection also has to be enabled imperatively via code, by calling the 'EnableCollection' method of 'CrashDumps'. It does not, however, require changing the diagnostic configuration. The Boolean parameter specifies whether a full dump (true) or a mini dump (false) is to be collected.

CrashDumps.EnableCollection(false);

Specify Storage Account

Upon importing the diagnostics module, a configuration setting named 'Microsoft.WindowsAzure.Plugins.Diagnostics.ConnectionString' gets automatically added. This setting specifies the Windows Azure Storage account to which the diagnostic data will be transferred. The default value is 'UseDevelopmentStorage=true', which should be used when running the application on the compute emulator. When using a Windows Azure Storage account, the protocol has to always be set to 'https' (to ensure security, as logs more often than not contain sensitive information).

<?xml version="1.0" encoding="utf-8"?>
<ServiceConfiguration serviceName="TroubleShootingSample" xmlns="http://schemas.microsoft.com/ServiceHosting/2008/10/ServiceConfiguration" osFamily="1" osVersion="*" schemaVersion="2012-05.1.7">
  <Role name="WorkerRole1">
    <Instances count="1" />
    <ConfigurationSettings>
      <Setting name="Microsoft.WindowsAzure.Plugins.Diagnostics.ConnectionString" value="DefaultEndpointsProtocol=https;AccountName=AccountName;AccountKey=AccountKey" />
    </ConfigurationSettings>
  </Role>
</ServiceConfiguration>

Setup Transfer

Diagnostic data collected by WAD is not permanently stored unless it is transferred to Windows Azure Storage. Once transferred, the diagnostic data can be viewed with one of several available tools like Cloud Storage Studio, Azure Storage Explorer etc.
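Putting the preceding pieces together, a worker role's 'OnStart' might look like the following sketch. This is a minimal, assumed composition of the calls shown above; the class name is illustrative.

```csharp
using Microsoft.WindowsAzure.Diagnostics;
using Microsoft.WindowsAzure.ServiceRuntime;

public class WorkerRole : RoleEntryPoint
{
    public override bool OnStart()
    {
        DiagnosticMonitorConfiguration config =
            DiagnosticMonitor.GetDefaultInitialConfiguration();

        // Collect 'System' Windows event log entries.
        config.WindowsEventLog.DataSources.Add("System!*");

        // Collect mini dumps (pass true for full dumps).
        CrashDumps.EnableCollection(false);

        DiagnosticMonitor.Start(
            "Microsoft.WindowsAzure.Plugins.Diagnostics.ConnectionString", config);

        return base.OnStart();
    }
}
```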

Schedule Transfer

We need to imperatively change the WAD configuration to schedule the transfer of diagnostic data. Each data source added for collection has an associated data buffer (local disk storage). The data transfer is scheduled by calling the 'GetDefaultInitialConfiguration' method on 'DiagnosticMonitor', choosing the configuration property corresponding to the data buffer, assigning a 'TimeSpan' to the 'ScheduledTransferPeriod' property of that configuration property, and then calling the 'Start' method on 'DiagnosticMonitor' with the changed configuration. The following snippet schedules the transfer of file based logs (IIS logs etc.) to every 10 minutes:

DiagnosticMonitorConfiguration config = DiagnosticMonitor.GetDefaultInitialConfiguration();
config.Directories.ScheduledTransferPeriod = TimeSpan.FromMinutes(10);
DiagnosticMonitor.Start("Microsoft.WindowsAzure.Plugins.Diagnostics.ConnectionString", config);

On-Demand Transfer

Diagnostic data can also be transferred on-demand, from within the role or from an outside application running either on-premise or on Windows Azure. The on-demand transfer is done using the role instance diagnostic manager, choosing the data buffer associated with the data source whose logs are to be transferred, and specifying the time interval of the logs to be transferred.

If performing an on-demand transfer from an outside application, we need to obtain the deployment identifier, role name and role instance name for which diagnostic data is to be transferred. This can be done by performing the following steps:

1. Log on to the Azure developer portal.
2. Click Compute Services, and then expand the node for your application.
3. Click the deployment for the application. Record the ID value from the Properties pane. This is the deployment identifier of your hosted service.
4. Expand the deployment node, and then click the node for the role from which you want to collect diagnostic data. Record the Name value from the Properties pane. This is the name of the role.
5. Expand the role node, and then click the node for the role instance. Record the Name value from the Properties pane. This is the identifier of the role instance.

CloudStorageAccount storageAccount = CloudStorageAccount.Parse("DefaultEndpointsProtocol=https;AccountName=<AccountName>;AccountKey=<AccountKey>");
DeploymentDiagnosticManager diagManager = new DeploymentDiagnosticManager(storageAccount, "<DeploymentID>");
RoleInstanceDiagnosticManager roleInstDiagMgr = diagManager.GetRoleInstanceDiagnosticManager("<RoleName>", "<RoleInstanceID>");

DataBufferName dataBuffersToTransfer = DataBufferName.Directories;

OnDemandTransferOptions transferOptions = new OnDemandTransferOptions();
transferOptions.NotificationQueueName = "wad-on-demand-transfers";
TimeSpan timeInterval = new TimeSpan(3, 0, 0);
transferOptions.From = DateTime.UtcNow.Subtract(timeInterval);
transferOptions.To = DateTime.UtcNow;

Guid requestID = roleInstDiagMgr.BeginOnDemandTransfer(dataBuffersToTransfer, transferOptions);

Tracing

Tracing is a good way to monitor the execution of the application while it is running, and also comes in handy for troubleshooting. WAD integrates well with the .NET tracing system, and collection of trace statements can be enabled in a Windows Azure application by simply adding the 'DiagnosticMonitorTraceListener' to the configuration. (It gets added by default when creating projects in VS.)

<configuration>
  <system.diagnostics>
    <trace>
      <listeners>
        <add type="Microsoft.WindowsAzure.Diagnostics.DiagnosticMonitorTraceListener, Microsoft.WindowsAzure.Diagnostics, Version=1.7.0.0, Culture=neutral, PublicKeyToken=31bf3856ad364e35" name="AzureDiagnostics">
          <filter type="" />
        </add>
      </listeners>
    </trace>
  </system.diagnostics>
</configuration>

The trace statements are collected as 'Windows Azure Logs' by the diagnostic monitor.

Viewing diagnostic data

The diagnostic data transferred to Windows Azure Storage is stored in either Blobs or Tables.

Tables:
- WADLogsTable – Windows Azure Logs
- WADDiagnosticInfrastructureLogsTable – WAD Infrastructure Logs
- WADDirectoriesTable – Contains information about the directories that the diagnostic monitor is monitoring. This includes IIS logs, IIS failed request logs, and custom directories. The location of the blob log file is specified in the Container field and the name of the blob is in the RelativePath field. The AbsolutePath field indicates the location and name of the file as it existed on the Windows Azure virtual machine.
- WADPerformanceCountersTable – Performance Counters
- WADWindowsEventLogsTable – Windows Event Logs

Blobs:
- wad-iis-failedreqlogfiles – Failed Request Logs
- wad-iis-logfiles – IIS Logs
- <custom> – Custom Error Logs

Manage WAD Configuration

The configuration information is stored in an XML file in Windows Azure blob storage under /wad-control-container/<deploymentID>/<rolename>/<roleinstance>. The diagnostic monitor periodically polls the configuration XML file for changes and applies them to the running instances. We can change the configuration either remotely, from code running outside the Windows Azure application, or from within the application itself. Following is a sample configuration XML file:

<?xml version="1.0"?>
<ConfigRequest xmlns:xsd="http://www.w3.org/2001/XMLSchema" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance">
  <DataSources>
    <OverallQuotaInMB>8192</OverallQuotaInMB>
    <Logs>
      <BufferQuotaInMB>1024</BufferQuotaInMB>
      <ScheduledTransferPeriodInMinutes>1</ScheduledTransferPeriodInMinutes>
      <ScheduledTransferLogLevelFilter>Information</ScheduledTransferLogLevelFilter>
    </Logs>
    <DiagnosticInfrastructureLogs>
      <BufferQuotaInMB>1024</BufferQuotaInMB>
      <ScheduledTransferPeriodInMinutes>0</ScheduledTransferPeriodInMinutes>
      <ScheduledTransferLogLevelFilter>Information</ScheduledTransferLogLevelFilter>
    </DiagnosticInfrastructureLogs>
    <PerformanceCounters>
      <BufferQuotaInMB>1024</BufferQuotaInMB>
      <ScheduledTransferPeriodInMinutes>0</ScheduledTransferPeriodInMinutes>
      <Subscriptions />
    </PerformanceCounters>
    <WindowsEventLog>
      <BufferQuotaInMB>1024</BufferQuotaInMB>
      <ScheduledTransferPeriodInMinutes>1</ScheduledTransferPeriodInMinutes>
      <Subscriptions>
        <string>Application!*</string>
      </Subscriptions>
      <ScheduledTransferLogLevelFilter>Information</ScheduledTransferLogLevelFilter>
    </WindowsEventLog>
    <Directories>
      <BufferQuotaInMB>0</BufferQuotaInMB>
      <ScheduledTransferPeriodInMinutes>1</ScheduledTransferPeriodInMinutes>
      <Subscriptions>
        <DirectoryConfiguration>
          <Path>C:\Users\Administrator\AppData\Local\dftmp\Resources\bf046678-2437-4a71-9a65-a363c826b5b3\directory\DiagnosticStore\CrashDumps</Path>
          <Container>wad-crash-dumps</Container>
          <DirectoryQuotaInMB>1024</DirectoryQuotaInMB>
        </DirectoryConfiguration>
      </Subscriptions>
    </Directories>
  </DataSources>
  <IsDefault>false</IsDefault>
</ConfigRequest>
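The '<Directories>' section of this configuration is also how the 'Custom Error Logs' data source is represented. The following is a hedged sketch of wiring up a custom log directory from code, assuming a local resource named 'CustomLogs' has been declared in the service definition; the resource and container names are illustrative, not part of the original walkthrough.

```csharp
using System;
using Microsoft.WindowsAzure.Diagnostics;
using Microsoft.WindowsAzure.ServiceRuntime;

DiagnosticMonitorConfiguration config = DiagnosticMonitor.GetDefaultInitialConfiguration();

// Hypothetical local resource declared in the service definition.
LocalResource customLogs = RoleEnvironment.GetLocalResource("CustomLogs");

DirectoryConfiguration dirConfig = new DirectoryConfiguration();
dirConfig.Path = customLogs.RootPath;     // directory monitored on the VM
dirConfig.Container = "wad-custom-logs";  // destination blob container
dirConfig.DirectoryQuotaInMB = 1024;

config.Directories.DataSources.Add(dirConfig);
config.Directories.ScheduledTransferPeriod = TimeSpan.FromMinutes(1);

DiagnosticMonitor.Start("Microsoft.WindowsAzure.Plugins.Diagnostics.ConnectionString", config);
```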

To remotely update the configuration, we need to obtain the deployment identifier, role name and role instance name, as explained in the 'On-Demand Transfer' section. A 'RoleInstanceDiagnosticManager' can then be used to retrieve the current configuration and update it as per the requirement. The following code snippet shows how to add collection of a performance counter to the configuration and enable its transfer to Windows Azure storage:

CloudStorageAccount storageAccount = CloudStorageAccount.Parse(
    RoleEnvironment.GetConfigurationSettingValue("Microsoft.WindowsAzure.Plugins.Diagnostics.ConnectionString"));
DeploymentDiagnosticManager deploymentDiagManager = new DeploymentDiagnosticManager(storageAccount, "<DeploymentID>");
IEnumerable<RoleInstanceDiagnosticManager> roleDiagManagers =
    deploymentDiagManager.GetRoleInstanceDiagnosticManagersForRole("<RoleName>");

PerformanceCounterConfiguration perfCounterConfig = new PerformanceCounterConfiguration();
perfCounterConfig.CounterSpecifier = @"\Processor(_Total)\% Processor Time";
perfCounterConfig.SampleRate = TimeSpan.FromSeconds(5);

foreach (RoleInstanceDiagnosticManager roleDiagManager in roleDiagManagers)
{
    DiagnosticMonitorConfiguration currentConfiguration = roleDiagManager.GetCurrentConfiguration();
    currentConfiguration.PerformanceCounters.DataSources.Add(perfCounterConfig);
    currentConfiguration.PerformanceCounters.ScheduledTransferPeriod = TimeSpan.FromMinutes(10);
    // Update the configuration
    roleDiagManager.SetCurrentConfiguration(currentConfiguration);
}

Following is the updated sample configuration XML file after running the above code snippet:

<?xml version="1.0"?>
<ConfigRequest xmlns:xsd="http://www.w3.org/2001/XMLSchema" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance">
  <DataSources>
    <OverallQuotaInMB>8192</OverallQuotaInMB>
    <Logs>
      <BufferQuotaInMB>1024</BufferQuotaInMB>
      <ScheduledTransferPeriodInMinutes>1</ScheduledTransferPeriodInMinutes>
      <ScheduledTransferLogLevelFilter>Information</ScheduledTransferLogLevelFilter>
    </Logs>
    <DiagnosticInfrastructureLogs>
      <BufferQuotaInMB>1024</BufferQuotaInMB>
      <ScheduledTransferPeriodInMinutes>0</ScheduledTransferPeriodInMinutes>
      <ScheduledTransferLogLevelFilter>Information</ScheduledTransferLogLevelFilter>
    </DiagnosticInfrastructureLogs>
    <PerformanceCounters>
      <BufferQuotaInMB>1024</BufferQuotaInMB>
      <ScheduledTransferPeriodInMinutes>10</ScheduledTransferPeriodInMinutes>
      <Subscriptions>
        <PerformanceCounterConfiguration>
          <CounterSpecifier>\Processor(_Total)\% Processor Time</CounterSpecifier>
          <SampleRateInSeconds>5</SampleRateInSeconds>
        </PerformanceCounterConfiguration>
      </Subscriptions>
    </PerformanceCounters>
    <WindowsEventLog>
      <BufferQuotaInMB>1024</BufferQuotaInMB>
      <ScheduledTransferPeriodInMinutes>1</ScheduledTransferPeriodInMinutes>
      <Subscriptions>
        <string>Application!*</string>
      </Subscriptions>
      <ScheduledTransferLogLevelFilter>Information</ScheduledTransferLogLevelFilter>
    </WindowsEventLog>
    <Directories>
      <BufferQuotaInMB>0</BufferQuotaInMB>
      <ScheduledTransferPeriodInMinutes>1</ScheduledTransferPeriodInMinutes>
      <Subscriptions>
        <DirectoryConfiguration>
          <Path>C:\Users\Administrator\AppData\Local\dftmp\Resources\f3c856df-f9d6-4bff-995a-89b93640b6ce\directory\DiagnosticStore\CrashDumps</Path>
          <Container>wad-crash-dumps</Container>
          <DirectoryQuotaInMB>1024</DirectoryQuotaInMB>
        </DirectoryConfiguration>
      </Subscriptions>
    </Directories>
  </DataSources>
  <IsDefault>false</IsDefault>
</ConfigRequest>

Analyzing Crash Dumps using WinDbg

The crash dumps are collected in a blob container named 'wad-crash-dumps', under which a virtual blob directory hierarchy is created for each role and the instances that crashed. Note: The example showcased here is for a mini-dump.

Mini-Dump Setup

Perform the following steps to prepare the environment for analyzing the crash dump:

1. Download the dump to be analyzed to the local machine using a tool like Azure Storage Explorer or Cloud Storage Studio.
2. Install WinDbg from MSDN, selecting the OS version appropriately. Make sure to select the 'Debugging Tools for Windows' feature in the installation wizard.

Analyze

To begin, start WinDbg and open the crash dump file downloaded from Windows Azure storage (File -> Open Crash Dump, or press CTRL + D).

The first step to perform after loading the crash dump is to load SOS.dll, which is shipped with the .NET Framework and helps in debugging managed programs. This can be done with the command '.loadby SOS clr'.

The next step is to view the managed stack trace at the time of the crash. This is done with the command '!CLRStack -a'. As shown in the screenshots above, the line of code responsible for the exception is rightly identified from the stack trace (CrashTask.cs, Ln 15).

To print the exception details, use the command '!pe'.

Note: Unhandled exceptions will also come up in the Windows Event Logs.

Full Dump

Analyzing a full dump is similar to a mini-dump, the difference being the access to the memory heap. Follow the setup steps mentioned for the mini-dump. (Notice the size of the dump file in the screenshot below; it has grown to 198 MB from 16 MB in the case of the mini-dump.)
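To recap, the mini-dump session described above needs only three WinDbg commands, run in this order:

```
.loadby SOS clr    ; load the SOS managed-debugging extension from the dump's CLR
!CLRStack -a       ; show the managed stack trace with arguments and locals
!pe                ; print the details of the current exception
```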

Run the command '!clrstack -a' like we did for the mini-dump. However, since we now have access to the memory heap, we can drill down into the objects pertaining to the parameters. The region marked in red corresponds to the parameter being passed to the 'InvalidOperationException' (see the source code snapshot in the previous section). We can find out the contents of the parameter object using the command '!dumpobj 00000000029718c0'.

Similarly, to look at the 'CrashTask' object that was used for calling 'Execute', we can use the command '!dumpobj 0x000000000296d2c0' (the object address retrieved from the 'LOCALS' section in the method call 'WorkerRole1.Run' in the previous screenshot). A list of all the available commands can be found on MSDN.

Best Practices

Optimal WAD configuration

It is quite important to have a suitable WAD configuration, both to prevent excess cost and to avoid affecting the application's performance. The following points should be given due consideration:

- The WAD storage account used to transfer the logs should be in the same data center as the application. This will prevent incurring data transfer costs.

- Use a configuration setting for the transfer schedule period. The setting can then be set appropriately for different environments by making use of the 'multiple service configuration files' feature. For example, 'ServiceConfiguration.Local.cscfg' could have the setting:

<Setting name="ScheduleTransferPeriodInMinutes" value="1"/>

while 'ServiceConfiguration.Cloud.cscfg' could specify a longer period. The WAD configuration code in the role can then be modified as shown below to make use of the configuration setting:

var scheduledTransferPeriodInMins = RoleEnvironment.GetConfigurationSettingValue("ScheduleTransferPeriodInMinutes");
TimeSpan scheduledTransferPeriod = TimeSpan.FromMinutes(Convert.ToDouble(scheduledTransferPeriodInMins));

DiagnosticMonitorConfiguration config = DiagnosticMonitor.GetDefaultInitialConfiguration();
config.Logs.ScheduledTransferPeriod = scheduledTransferPeriod;
DiagnosticMonitor.Start("Microsoft.WindowsAzure.Plugins.Diagnostics.ConnectionString", config);

This will help developers access the logs quickly when debugging applications in the local environment, and prevent cost as well as performance overhead when the application is deployed to the cloud.

- The transfer of 'Windows Azure Logs' and 'Windows Event Logs' should be regulated by appropriately setting the filter for the log level. The different log levels are: Critical, Error, Warning, Information, Verbose. The level is 'cumulative', i.e. if the filter is set to 'Warning' then both 'Critical' and 'Error' are included as well. We can use a configuration setting to specify the logging level (as done above for the transfer schedule period):

ServiceConfiguration.Local.cscfg: <Setting name="LogLevelFilter" value="Information"/>
ServiceConfiguration.Cloud.cscfg: <Setting name="LogLevelFilter" value="Error"/>

var logLevelFilterVal = RoleEnvironment.GetConfigurationSettingValue("LogLevelFilter");
LogLevel logLevelFilter = (LogLevel)Enum.Parse(typeof(LogLevel), logLevelFilterVal);

DiagnosticMonitorConfiguration config = DiagnosticMonitor.GetDefaultInitialConfiguration();
config.Logs.ScheduledTransferLogLevelFilter = logLevelFilter;
config.WindowsEventLog.ScheduledTransferLogLevelFilter = logLevelFilter;
config.DiagnosticInfrastructureLogs.ScheduledTransferLogLevelFilter = logLevelFilter;
DiagnosticMonitor.Start("Microsoft.WindowsAzure.Plugins.Diagnostics.ConnectionString", config);

Correctly Configure Data Buffer Sizes

We need to have a rough estimate of the total storage required for the data sources that have been configured for collection by WAD. 'OverallQuotaInMB' sets the rewritable wraparound buffer for all the diagnostic data collected from all the configured data sources. By default, 'OverallQuotaInMB' is set to 4 GB. If you have configured a lot of data sources for collection, the default value may not always be sufficient, and there is a risk of the collected data getting overwritten (the oldest data is deleted as new data is added) before it is transferred to Windows Azure storage. (The deletion of the oldest data occurs after transfer too.)

To go beyond the default value, add a <LocalStorage> element for 'DiagnosticStore', with the 'sizeInMB' attribute set to the new size, to the 'ServiceDefinition.csdef' file and change the 'OverallQuotaInMB' value accordingly. The maximum value is capped by the size of the local disk of the VM instance. For example, when using a 'small' instance, the maximum size of local storage available is 165 GB; with 'OverallQuotaInMB' at 4 GB, you are left with about 161 GB for the application. If the quota exceeds the available local storage, WAD will fail, and the only way to see the error is to attach a debugger or have a try-catch block.

It is also important to remember that 'OverallQuotaInMB' is shared amongst all the data sources, and that each corresponding data buffer can be individually configured to have its own maximum value by setting the 'BufferQuotaInMB' property. The default is zero, which means the buffer is limited only by 'OverallQuotaInMB'; it can also be set explicitly. It is therefore important to take care while setting the individual data buffer sizes so that the aggregate value does not exceed 'OverallQuotaInMB'.
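For illustration, the 'ServiceDefinition.csdef' change described above could look like the following sketch; the 8 GB size is illustrative, and the 'cleanOnRoleRecycle' attribute matches the note made about it later in this document.

```
<LocalResources>
  <LocalStorage name="DiagnosticStore" sizeInMB="8192" cleanOnRoleRecycle="false" />
</LocalResources>
```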

com . DiagnosticMonitorConfiguration config = DiagnosticMonitor. e-social and mobile.NET technology ecosystem. // Set the individual data buffer explicitly and make sure it is less than the OverallQuotaInMB set https://www. web businesses and enterprises leverage the power of cloud.BufferQuotaInMB = 1024.DiagnosticInfrastructureLogs.Start("Microsoft.ConnectionString".Logs. For example.aditi. We are one of the top 3 Platform-as-aService solution providers globally and one of the top 5 Microsoft technology partners in US. WWF. It is therefore better to enable collection of mini-dumps unless a full dump is absolutely needed. WPF.BufferQuotaInMB = 0. He has had the opportunity to get his hands dirty with WCF.Directories.Note: By setting ‘cleanOnRoleRecyle’ attribute to ‘false’ weensure that data is wiped out when the role https://twitter. We provide innovation solutions in 4 domains: Digital Marketing solutions that enable online businesses increase customer acquisition Cloud Solutions that help companies build for traffic and computation surge Enterprise Social that enables enterprises enhance collaboration and productivity Product Engineering services that help ISVs accelerate time-to-market www. Since windows azure VM instances run 64-bit version of Windows. ASP. config.facebook.Diagnostics.PerformanceCounters. config).com/company/aditi-technologies http://adititechnologiesblog. the programming language that he is proficient being C#.BufferQuotaInMB = 1024. for example to examine memory leaks.GetDefaultInitialConfiguration().WindowsAzure. this does not guarantee that the data will remain if the instance is moved (hardware problem etc.). config.WindowsEventLog.BufferQuotaInMB = 1024. We are passionate about emerging technologies and are focused on custom development.BufferQuotaInMB = 1024. Over the past 4 years he has been focusing on designing Windows Azure based solutions.linkedin. However.OverallQuotaInMB = 8192.Plugins. 
He has been involved in developing applications using the various offerings of . the full crash dump of an ExtraLarge VM instance can go up to 14GB (worst case scenario). config.NET . DiagnosticMonitor. the size of full dump files can be quite www. to drive competitive advantage. config. // Use the rest of the storage here config. http://www. Use mini-dumps instead of full dumps Full dump files contain the process’s memory at the time of crash. About Aditi Aditi helps product companies. // Set an overall quota of 8GB. Brihadish Kaushik is Technical Lead at Aditi Technologies. analyze object structures etc.
