J2i.Net

Nothing at all and Everything in general.

Getting Assembly Version Number in WinRT

It seems that every time you move from one .Net platform to another, the method for reading an assembly's version number through reflection changes. I needed this information for a WinRT program earlier and was able to get it with a regular expression and the slightly different reflection methods in WinRT. The code follows:

 

var asmName = this.GetType().AssemblyQualifiedName;
var versionExpression = new System.Text.RegularExpressions.Regex("Version=(?<version>[0-9.]*)");
var m = versionExpression.Match(asmName);
string version = String.Empty;
if (m.Success)
{
    version = m.Groups["version"].Value;
}
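
As a small follow-on sketch, the captured text can be handed to System.Version.Parse when the individual components are needed as numbers. The helper name ExtractVersion and the sample assembly-qualified name are made up for illustration, and I'm assuming Version.Parse is available on the target platform (it is in .Net 4 and later).

```csharp
using System;
using System.Text.RegularExpressions;

public static class AssemblyVersionSketch
{
    // Hypothetical helper: pulls "Version=a.b.c.d" out of an assembly-qualified name.
    // Returns null when no version is present.
    public static Version ExtractVersion(string assemblyQualifiedName)
    {
        Match m = Regex.Match(assemblyQualifiedName, @"Version=(?<version>[0-9]+(\.[0-9]+){1,3})");
        return m.Success ? Version.Parse(m.Groups["version"].Value) : null;
    }

    public static void Main()
    {
        // Made-up name in the usual "Type, Assembly, Version=..., ..." shape.
        string sample = "MyApp.MainPage, MyApp, Version=1.2.3.4, Culture=neutral, PublicKeyToken=null";
        Version v = ExtractVersion(sample);
        Console.WriteLine("Major {0}, Minor {1}, Build {2}, Revision {3}", v.Major, v.Minor, v.Build, v.Revision);
    }
}
```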

External Motion Sensor for Windows 8

I ordered the STEVAL-MKI119V1 eMotion sensor board for Windows 8 earlier this week and it showed up today. Unfortunately it looks like I can't start using it just yet. To use the device it is necessary to update the firmware on the board, but the firmware isn't openly available on the vendor's website. You've got to fill out a request form for it. If/when the request is approved the software will be made available to you.

 

If you plan on ordering one, go ahead and fill out the form the same day that you purchase it so that you have a chance of being approved by the time the hardware arrives. You can request access to the software here: http://www.st.com/internet/evalboard/product/252756.jsp

The instructions on performing the update can be found here:

http://www.st.com/internet/com/TECHNICAL_RESOURCES/TECHNICAL_LITERATURE/USER_MANUAL/DM00041151.pdf

 

 

New Meetup Presentation Series: Getting Started with Windows Phone

Over the next few weeks the Atlanta Windows Phone Developer Meetup group will be having a series of presentations on getting started with Windows Phone development. Many of the presentations will be done by Glen Gordon. I will be doing a few of the presentations too. These presentations are targeting developers that haven't yet gotten started with Windows Phone. So if you've not done any WP development before no worries, these presentations should be fine for you. If you are interested you can find information on the next and future meetings on the Meetup page here: http://www.meetup.com/Win-Phone-7-Developers-Atlanta/

 

Side note: For the presentations I'm doing the functionality works almost exactly the same for Windows Phone and Windows 8. I'm tempted to make an online presentation for both and put it on the web. But the real world presentation will concentrate only on Windows Phone. 

 

Windows 8 : Development Installation Options - USB (Windows to Go)

The consumer preview of Windows 8 has been available for almost a month now. I've got the OS running as the main operating system on three of my computers: a slate, a tablet, and a desktop. While running natively on hardware is, in my opinion, the best way to run Windows 8, there are various reasons why someone may not be able to do this, such as not having access to additional hardware (or the desired configuration being too expensive for multiple machines).

 

I was looking at some of the other options for running Windows 8. These included the following:

 

  • Hardware Emulation
    • VMWare
    • Sun Virtual Box
  • USB Installation
  • VHD Boot
Hardware emulation is straightforward. I tried out Sun Virtual Box during the first release of Windows 8 CTP (at the time it was the only emulator that would work!). Since then VMWare has updated their products to be compatible too, though with VMWare there are a few things you'll want to do to ensure that you don't run into problems. When you are creating the new virtual machine select the option to install the operating system later; the automatic OS installation doesn't yet target Windows 8. Secondly, you will want to edit the virtual machine configuration and remove the floppy disk. You will get a nebulous error during installation if the floppy disk is still present. I've found the performance to be acceptable within emulation (your mileage may vary since emulation performance is necessarily dependent on the performance of the host machine). But I find it more usable when it is running in full screen mode, because it is difficult to hit the one pixel boundaries on the edges of the screen in windowed mode.

USB installation for the most part worked pretty well. But during operations that required heavy IO on the boot drive (which are more operations than you would think), other tasks that required some level of IO appeared to freeze. To make a USB installation you will need to already have an up and running installation of Windows 8 (either in an emulator or on a real device). The instructions for the process can be found here.  If you look in Windows Explorer you'll see that the main drive still shows as a removable drive.

This causes an issue with installing the developer tools. The developer tools will *not* install on a removable drive. 

 

I tried to do a few things to circumvent this and finally found something that works. I had to make a VHD on my USB key and mount it and use that as the installation target.  I assigned this VHD drive the letter [P:] and tried the installation process again and it worked. 

 

 

VHD means Virtual Hard Disk. It's a file that contains an entire file system packaged in a file. Virtual PC uses this format but you can also make them at will from the disk manager. To make a VHD do the following. 

 

  1. Open the Control Panel
  2. Open  "Administrative Tools"
  3. Open "Computer Management"
  4. On the left pane select "Disk Management" under the "Storage" group
  5. Right-click on "Disk Management" and select "Create VHD"
    1. Select the maximum size of the virtual drive
    2. Select whether you want to preallocate the space or allocate it as needed
  6. Click on OK. The drive is automatically mounted 
  7. Right-click on the disk name and select "Initialize."
  8. Click on OK on the dialog that shows up
  9. Right-click on the area to the right of the disk and select "New Simple Volume."
  10. On the dialog that shows click on "Next" until you are able to select a drive letter. 
  11. Select a drive letter and click "Next"
  12. Enter a name for the volume and click "Next"
  13. Click "Finish."
After you click on "Finish" the drive will be available in the file explorer and available as an installation target. 

I'm still looking into the option of booting up from a VHD. With VHD bootup the operating system would be installed in a virtual hard drive that is saved on the primary drive. When the computer is turned on you get the option of booting up into the main OS or into the VHD.  But it seems that my hard drive encryption software is causing some problems. I'll post back after I get the VHD option working.


 

 

How Big is this Object on Windows Phone

I wanted to validate some of my understanding of the amount of memory that a .Net instance occupies on Windows Phone and sent a request to Abhinaba Basu of the .Net Compact Framework team for more information. He promptly responded with a nice explanation of how a .Net object is laid out in memory. Instead of reposting his work I'd like to refer you to his blog and the article he wrote explaining this. I'll be making some references to it in the near future.


http://blogs.msdn.com/b/abhinaba/archive/2012/02/02/wp7-clr-managed-object-overhead.aspx

Sparse Array class for .Net

Download Code(312 Kb)

Introduction

I had the need for a dynamically growing sparsely populated array. "Sparse" implies that there will be a lot of elements that contain empty values between the ones that contain non-empty values. The .Net collections namespace doesn't contain anything that meets this need. The ArrayList class dynamically grows, but it would have elements allocated for the empty and non-empty values alike. I plan to use this code on devices that have limited memory so this wouldn't do. I ended up making my own class to satisfy this need. While my initial need for this code is for byte arrays I made the code generic so that it can be used with other data types.

How Big is this Class?

Since I will be talking about memory allocation I'll also need to talk about how to get a sense of how much memory an instance of something costs. This won't account for 100% of the memory that is consumed by the allocation of the object. But it will be close enough for you to start making judgements about whether one object costs more or less memory than another.

 

There are a number of predefined value types for which the size is well known. Here are some of the most common ones.

Type     Size (in bytes)
byte     1
int      4
short    2
float    4
double   8
char     2

If you build a struct it is a value type too. The size of a struct will be about the sum of the size of its members. Here is an example.

struct Location
{
    public int LocationID;   // 4 bytes
    public double Latitude;  // 8 bytes
    public double Longitude; // 8 bytes
    public double Elevation; // 8 bytes
}

The members of this structure add up to 28 bytes (the runtime may pad this to 32 bytes so that the doubles stay aligned). Now what happens if you add a reference type (something that is defined as a class instead of a struct)? How big will it be? I'll add a string to the previous example.
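
One way to check figures like these is to ask the runtime. The sketch below is only an illustration: it uses sizeof for the built-in types and Marshal.SizeOf for the struct. Marshal.SizeOf reports the marshalled (unmanaged) size, which I'm assuming matches the managed layout for a simple struct of numeric fields like this one. The class name StructSizeSketch is made up.

```csharp
using System;
using System.Runtime.InteropServices;

public static class StructSizeSketch
{
    struct Location
    {
        public int LocationID;
        public double Latitude;
        public double Longitude;
        public double Elevation;
    }

    // Marshalled size of the struct, including any alignment padding.
    public static int LocationSize()
    {
        return Marshal.SizeOf(typeof(Location));
    }

    public static void Main()
    {
        Console.WriteLine("int    : {0} bytes", sizeof(int));
        Console.WriteLine("float  : {0} bytes", sizeof(float));
        Console.WriteLine("double : {0} bytes", sizeof(double));
        Console.WriteLine("Location struct : {0} bytes", LocationSize());
    }
}
```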

struct Location
{
    public int LocationID;      // 4 bytes
    public string LocationName; // ?
    public double Latitude;     // 8 bytes
    public double Longitude;    // 8 bytes
    public double Elevation;    // 8 bytes
}

There are two elements of memory to consider for the reference field. There is the size of the reference and the size of the object to which it is pointing. Think of the LocationName field in the example above as holding a memory address to the area where the string is being held. The size of this memory address will depend on the processor architecture that the code is running against. If the code is JIT compiled for a 32-bit system then the reference will be 32 bits (4 bytes) in size. If it is JIT compiled for a 64-bit system then the reference will be 64 bits (8 bytes) in size. When I am working with just a desktop I do my calculations based on 64-bit systems. But the code in this article will run on Windows Phone so I will be taking both into consideration. Every object also carries a header that is around 8 bytes. The other element of memory to take into consideration is the size of the string itself. If the string has not yet been assigned and the field is null then this second element has a size of zero. If the string is assigned a value then the second element will be whatever memory is consumed by the string.

It is possible for multiple structs to refer to the same instance of a reference type. When this occurs you'll want to make sure that you don't count the memory taken up by that instance multiple times.

Now let's add an array and see what that does for our memory calculations. The array itself is a reference type, so the amount of memory the reference to it will consume is dependent on the processor architecture.

struct Location
{
    public int LocationID;         // 4 bytes
    public string LocationName;    // 4/8 bytes + string memory
    public double Latitude;        // 8 bytes
    public double Longitude;       // 8 bytes
    public double Elevation;       // 8 bytes
    public int[] SomeRandomArray;  // 4/8 bytes + array memory
}

The size of the memory associated with the array will be about the sum of the sizes of the elements that it contains. If the array contains value types (the above contains a value type of int) then the memory allocated when the array is initialized is sizeof(int) (4 bytes) multiplied by the number of elements in the array. If the array contains reference types then the memory consumed by the array will be the size of a reference (4 bytes on a 32-bit system, 8 bytes on a 64-bit system) multiplied by the number of elements that the array can hold. This doesn't include the size of the individual instances that the elements refer to, each of which also carries its own object header of around 8 bytes.
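
The arithmetic can be sketched as below. This is a rough estimate only: it counts the element payload and ignores the array's own header and, for reference elements, the objects being pointed to. IntPtr.Size is used as the size of a reference in the current process; the class and method names are made up.

```csharp
using System;

public static class ArraySizeSketch
{
    // Payload of an array of value types: element size times element count.
    public static long ValueArrayBytes(int elementSize, int length)
    {
        return (long)elementSize * length;
    }

    // Payload of an array of references: 4 bytes each in a 32-bit process, 8 in a 64-bit one.
    public static long ReferenceArrayBytes(int length)
    {
        return (long)IntPtr.Size * length;
    }

    public static void Main()
    {
        Console.WriteLine("int[1000]    : about {0} bytes", ValueArrayBytes(sizeof(int), 1000));
        Console.WriteLine("string[1000] : about {0} bytes of references", ReferenceArrayBytes(1000));
    }
}
```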

What happens if I change any of the above from a struct to a class? Once a type is made into a reference type it is also going to have an object header. For a struct, when you declare a variable all of the memory that is going to be used by the immediate members is allocated. With a class only the memory for the reference (4 bytes on 32-bit, 8 bytes on 64-bit) is allocated. The memory needed for the members of a reference type is not allocated until there is a call to new. Also note that the memory for instances of a reference type is allocated on the heap, while a value type declared as a local variable is typically allocated on the stack.

A Look at Sparse and Contiguous Memory Allocation

If we take a look at how memory is allocated for a sparse array and a contiguous (normal) array, the reason I needed this will be clearer. As an oversimplified example let's say that an array of 40 elements must be allocated. Regular arrays will allocate the memory contiguously; the bytes associated with the array will be in the same block of memory. Let's say our sparse array is allocating blocks of memory for five elements at a time. The original layout highlighted the populated elements in blue; in the sparse array below, each row is one allocated chunk and the 00 entries in the third and fourth rows are allocated but empty elements.

Contiguous Array
00 01 02 03 04 05 06 07 08 09 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39
Sparse Array
00 01 02 03 04
05 06 07 08 09
20 21 00 00 00
00 00 00 00 34
35 36 37 38 39

At a glance the sparse array is taking up less memory than the conventional array. There are still some empty elements within the structure. This is because it's allocating memory for a range of indices. If there's even a single non-empty element within a 5 element range then there's an allocation for all 5 elements. Whether or not this results in less memory being allocated overall is going to depend on the usage patterns for the sparse array. It will work best when the populated elements are clustered closely to each other.  Can we get rid of the empty elements altogether? We could by reducing the size of the chunks allocated. The smaller the chunks, the lower the opportunity for empty positions within chunks to exist. If memory was allocated for single elements (a chunk that only holds one element) then we would only have populated elements in the list. But is this better?
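
To make the chunk arithmetic concrete, here is a small sketch that works out which chunks must exist for a given set of populated indices. It is not taken from the downloadable code; the names are made up, and the populated indices are the ones from the 40 element illustration (0 to 9, 20, 21, and 34 to 39) with a chunk size of 5.

```csharp
using System;
using System.Collections.Generic;

public static class ChunkCountSketch
{
    // Returns the start index of every chunk needed to hold the populated indices.
    public static SortedSet<int> AllocatedChunkStarts(IEnumerable<int> populatedIndices, int chunkSize)
    {
        var starts = new SortedSet<int>();
        foreach (int index in populatedIndices)
            starts.Add((index / chunkSize) * chunkSize); // round down to the chunk boundary
        return starts;
    }

    public static void Main()
    {
        var populated = new List<int>();
        for (int i = 0; i <= 9; i++) populated.Add(i);
        populated.Add(20);
        populated.Add(21);
        for (int i = 34; i <= 39; i++) populated.Add(i);

        SortedSet<int> starts = AllocatedChunkStarts(populated, 5);
        // Prints "Chunks: 5, elements allocated: 25 of 40"
        Console.WriteLine("Chunks: {0}, elements allocated: {1} of 40", starts.Count, starts.Count * 5);
    }
}
```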

It may be. But to get a better answer to this question there's a cost that as of yet has remained invisible for the sparse array. There's a structure for holding  each chunk that looks like the following in my implementation.

public class ArrayChunk<T>
{
    public int StartIndex { get; set;  } //4 bytes needed to contain an integer value

    public int EndIndex
    {
        get
        {
            if (Data == null)
                return -1;
            return StartIndex +  Data.Length-1;
        }
    }
    public T[] Data { get; set;  }
}

In addition to the array that holds the elements of the chunk itself there is also a field to hold the start index for the array. The start index consumes 4 bytes of memory. So there is an overhead of no less than 4 bytes for each chunk allocated.  Consider a conventional and a sparse array both of which are fully populated with 100 bytes of data. Also assume that the sparse array allocates memory for 10 bytes at a time. The memory consumed by the conventional array will be about 100 bytes. The memory consumed by the sparse array will be around 140 bytes (10 chunks of 14 bytes each). If I reduced the size of the chunks to only 2 bytes of data then we end up needing at least 6 bytes for each chunk. For the fully populated 100 element collection this would translate to no less than 300 bytes (50 chunks of 6 bytes each). With results like that one may wonder why even bother with the sparse array.  But consider another scenario. Let's say that only 20 elements of the array are populated, with 10 elements at the beginning of the array and 10 at the end. For the conventional array the memory consumed will still be at least 100 bytes. For the sparse array with 10 byte chunks it is around 28 bytes. Something that may become apparent is that the best case scenarios for this sparse array will occur when it is partially populated and the populated data items are clustered close to each other.
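
The estimate used in these scenarios is simply chunks * (elements per chunk * element size + 4). A minimal sketch of that calculation follows; it counts only the data and the 4 byte start index, not object headers or the list that holds the chunks, and the names are made up.

```csharp
using System;

public static class SparseOverheadSketch
{
    const int StartIndexBytes = 4; // the int StartIndex stored with each chunk

    public static int EstimateBytes(int chunkCount, int elementsPerChunk, int bytesPerElement)
    {
        return chunkCount * (elementsPerChunk * bytesPerElement + StartIndexBytes);
    }

    public static void Main()
    {
        // 100 bytes, fully populated, 10 byte chunks: 10 * (10 + 4)
        Console.WriteLine(EstimateBytes(10, 10, 1)); // 140
        // 100 bytes, fully populated, 2 byte chunks: 50 * (2 + 4)
        Console.WriteLine(EstimateBytes(50, 2, 1));  // 300
        // 20 bytes populated in two clusters, 10 byte chunks: 2 * (10 + 4)
        Console.WriteLine(EstimateBytes(2, 10, 1));  // 28
    }
}
```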

Growing

The conventional array cannot grow. Once allocated its size is fixed. There are other classes within .Net that can grow, such as the List<T> class or the MemoryStream class. My understanding is that once the memory buffer for any of these classes is consumed it will allocate a new memory buffer of twice the size of the one it had, copy all of its data to the new buffer, and then discard the old buffer to be reclaimed by garbage collection later. In trying to confirm this I found the source code for the MemoryStream class. The code of interest is below.

 

        //From https://singularity.svn.codeplex.com/svn/base/Libraries/System.IO/MemoryStream.cs

        private bool EnsureCapacity(int value) {
            // Check for overflow
            if (value < 0)
                throw new IOException("IO.IO_StreamTooLong");
            if (value > _capacity) {
                int newCapacity = value;
                if (newCapacity < 256)
                    newCapacity = 256;
                if (newCapacity < _capacity * 2)
                    newCapacity = _capacity * 2;
                Capacity = newCapacity;
                return true;
            }
            return false;
        }

        // Gets and sets the capacity (number of bytes allocated) for this stream.
        // The capacity cannot be set to a value less than the current length
        // of the stream.
        //
        //| <include file='doc\MemoryStream.uex' path='docs/doc[@for="MemoryStream.Capacity"]/*' />
        public virtual int Capacity {
            get {
                if (!_isOpen) __Error.StreamIsClosed();
                return _capacity - _origin;
            }
            set {
                if (!_isOpen) __Error.StreamIsClosed();
                if (value != _capacity) {
                    if (!_expandable) __Error.MemoryStreamNotExpandable();
                    if (value < _length) throw new ArgumentOutOfRangeException("value", "ArgumentOutOfRange_SmallCapacity");
                    if (value > 0) {
                        byte[] newBuffer = new byte[value];
                        if (_length > 0) Buffer.BlockCopy(_buffer, 0, newBuffer, 0, _length);
                        _buffer = newBuffer;
                    }
                    else {
                        _buffer = null;
                    }
                    _capacity = value;
                }
            }
        }
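
The growth pattern in the listing above can also be observed from the outside. The sketch below writes to a MemoryStream one byte at a time and records every new value that the Capacity property reports. The exact capacities depend on the runtime you run it on, so none are listed here; the class name GrowthSketch is made up.

```csharp
using System;
using System.Collections.Generic;
using System.IO;

public static class GrowthSketch
{
    // Records each distinct capacity reported while bytesToWrite bytes are written.
    public static List<int> ObserveCapacities(int bytesToWrite)
    {
        var capacities = new List<int>();
        using (var stream = new MemoryStream())
        {
            int last = stream.Capacity;
            for (int i = 0; i < bytesToWrite; i++)
            {
                stream.WriteByte(0);
                if (stream.Capacity != last)
                {
                    // The stream replaced its buffer with a larger one.
                    last = stream.Capacity;
                    capacities.Add(last);
                }
            }
        }
        return capacities;
    }

    public static void Main()
    {
        foreach (int capacity in ObserveCapacities(5000))
            Console.WriteLine("Buffer grew to {0} bytes", capacity);
    }
}
```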

This behaviour is just fine for typical scenarios, but I will be working with what are relatively large buffers (in comparison to the memory available on the devices on which I will be running my code). So I'd prefer to keep a low ceiling on the maximum amount of memory that gets allocated within the programs that I have in mind. It's also worth mentioning that the .Net runtime treats "large" objects differently than it does small ones. For more information take a look at The Dangers of the Large Object Heap. Large objects (around 85,000 bytes or larger) are allocated on a separate heap from small objects (under 85,000 bytes). During garbage collection the .Net garbage collector will try to compact the objects in the small object heaps into contiguous memory. Objects in the LOH (Large Object Heap) are not as easily addressed. From the referenced article:

Large objects pose a special problem for the runtime: they can’t be reliably moved by copying as they would require twice as much memory for garbage collection. Additionally, moving multi-megabyte objects around would cause the garbage collector to take an unreasonably long time to complete.


.NET solves these problems by simply never moving large objects around. After large objects are removed by the garbage collector, they leave behind holes in the large object heap, thereby causing the free space to become fragmented. When there’s no space at the end of the large object heap for an allocation, .NET searches these holes for a space, and expands the heap if none of the holes are large enough. This can become a problem. As a program allocates and releases blocks from the large object heap, the free blocks between the longer-lived blocks can become smaller. Over time, even a program that does not leak memory, and which never requires more than a fixed amount of memory to perform an operation, can fail with an OutOfMemoryException at the point that the largest free block shrinks to a point where it is too small for the program to use.

There are improvements in the LOH in .Net 4.5. Also note that on devices with more constrained memory (such as Windows Phone) there is no LOH. But there are still advantages to avoiding fragmented memory conditions.

Collection of Chunks

While the code I've written is meant to be a type of collection class, it still depends on another collection class for holding onto the chunks that it has. It is possible to use a List<T> or a Dictionary<T1,T2> for this. I've decided to go with the List<T>. Now doesn't it look dubious that I talked about the memory usage patterns of the List<T> class and now I'm using it in my underlying implementation! Isn't that just going to cause the same problem that I described above with memory fragmentation? Well, no, at least not as severely. The List<T> contains references to ArrayChunk instances, so each entry will be either 4 bytes or 8 bytes. Let's assume the worst case, which is 8 bytes. To cross the large object boundary the list would need to grow to more than 10,000 elements (85,000 / 8 = 10,625 references, where 85,000 is the large object size and 8 is the number of bytes needed to store a reference). How much data it takes to make the list this large is going to depend on how many elements you allow to be stored in each ArrayChunk. When the list does become large enough to occupy the LOH area the scenario is still better than what would happen with a conventional array, since the block of memory occupying the LOH is smaller than the block of memory that would have been occupied by a contiguous array of the elements.

For what the sparse array is capable of doing in the code presented with this writeup the LinkedList<T> would have been suitable (actually more suitable). The code primarily steps forward and backwards through the List<T> (I wrote it making the assumption that reads and writes will tend to be clustered close to each other). I used the List<T> because it seems to be a better fit for some modifications I plan to make to this code in the future. I won't go into those plans now since they could change. If they do change then I may swap out the List<T> for the LinkedList<T>. When I do make changes I want to ensure that I don't break any of the existing behaviour of the class. The project for this code also contains a few unit tests to validate that the behaviour of the SparseArray<T> doesn't vary from expectations.

Testing

There's a small test project included with the code. It will become more important as time goes on because as I make additions to this code I want to ensure that I don't break any of the behaviours that are already present. Right now the tests just check the virtual size of the array, ensure that memory chunks are allocated or deallocated as expected, and confirm that data is preserved.

What is EmptyValue For?

In the sparse array, cells that have not been written to are treated as though they exist and contain some default value. That default value is stored in EmptyValue. There are multiple uses for this member. If there is an attempted read from an unallocated cell, EmptyValue is returned. When the sparse array is searching for chunks to deallocate it will check the contents of each chunk to see if all of its elements are equal to EmptyValue. When a new SparseArray is instantiated the EmptyValue field is initialized by calling the default constructor/initializer for the type that it hosts. For the numeric types this will end up being a zero value. If this isn't appropriate for the data type that you are using there is a delegate named IsEmptyValue to which you can assign a method that returns true to indicate that a value or instance should be considered empty and false otherwise.
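
The emptiness check described here could look something like the sketch below. This is not the code from the download; it is a guess at the shape of the logic, with a Func<T, bool> standing in for the IsEmptyValue delegate and a made-up method name IsChunkEmpty.

```csharp
using System;
using System.Collections.Generic;

public static class EmptyCheckSketch
{
    // True when every element of the chunk counts as empty.
    // If no predicate is supplied, elements are compared against emptyValue.
    public static bool IsChunkEmpty<T>(T[] data, T emptyValue, Func<T, bool> isEmpty)
    {
        EqualityComparer<T> comparer = EqualityComparer<T>.Default;
        foreach (T item in data)
        {
            bool empty = isEmpty != null ? isEmpty(item) : comparer.Equals(item, emptyValue);
            if (!empty)
                return false; // one populated element keeps the chunk alive
        }
        return true;
    }

    public static void Main()
    {
        Console.WriteLine(IsChunkEmpty(new int[4], 0, null));             // all zeros
        Console.WriteLine(IsChunkEmpty(new[] { 0, 3, 0 }, 0, null));      // contains a 3
        Console.WriteLine(IsChunkEmpty(new[] { -1, -2 }, 0, x => x < 0)); // negatives treated as empty
    }
}
```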

Using the Code

Use of this code is not much unlike how you might use a strongly typed fixed size array. The main difference is that the upper bound on the SparseArray<T> grows as needed to accommodate writes to positions within the list. Reading from locations beyond the upper bound will not result in an exception. If someone reads from an index that is above the size of the array, the array remains unchanged. But if someone writes to a location above the upper limit then the array will update its reported size and allocate an ArrayChunk if needed.
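
To illustrate just the read and write rules described here, below is a stripped-down sketch. It is not the SparseArray<T> from the download: it swaps the List<T> of ArrayChunk instances for a Dictionary keyed by chunk number to keep the example short, and the name MiniSparseArray is made up.

```csharp
using System;
using System.Collections.Generic;

public class MiniSparseArray<T>
{
    private readonly Dictionary<int, T[]> chunks = new Dictionary<int, T[]>();
    private readonly int chunkSize;

    public MiniSparseArray(int chunkSize) { this.chunkSize = chunkSize; }

    public int Length { get; private set; }
    public int ChunkCount { get { return chunks.Count; } }

    public T this[int index]
    {
        get
        {
            if (index < 0) throw new ArgumentOutOfRangeException("index");
            T[] chunk;
            // Reads never allocate a chunk and never change Length.
            if (chunks.TryGetValue(index / chunkSize, out chunk))
                return chunk[index % chunkSize];
            return default(T);
        }
        set
        {
            if (index < 0) throw new ArgumentOutOfRangeException("index");
            T[] chunk;
            if (!chunks.TryGetValue(index / chunkSize, out chunk))
            {
                chunk = new T[chunkSize]; // allocate only when a write lands here
                chunks[index / chunkSize] = chunk;
            }
            chunk[index % chunkSize] = value;
            if (index >= Length) Length = index + 1; // writes can grow the virtual size
        }
    }
}

public static class MiniSparseArrayDemo
{
    public static void Main()
    {
        var array = new MiniSparseArray<int>(256);
        array[15] = 23;
        array[1024] = 2;
        Console.WriteLine("Length {0}, chunks {1}", array.Length, array.ChunkCount);
        Console.WriteLine("Read beyond the end: {0}, length still {1}", array[2048], array.Length);
    }
}
```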

Class Interface

Properties
Name             Type       Description
ChunkSize        int        Indicates the number of elements that each ArrayChunk can hold
Length           int        Returns the virtual size of the sparse array
ChunkCount       int        Returns the total number of chunks that the sparse array has
this[int index]  (Generic)  Indexer for accessing contents of array

Methods
Name                            Description
Condense                        Searches through the sparse array and removes the chunks that contain only empty values
ctor()                          Default constructor. Will create a sparse array with a chunk size of 256 elements
ctor(int size)                  Creates a sparse array with a chunk size that is determined by the size parameter passed
ctor(int size, int chunkCount)  Creates a sparse array with a chunk size that is determined by the size parameter passed. The chunkCount may be used as a hint for preallocating some memory

Usage Example

//Initialize an instance of the array
SparseArray<int> myArray = new SparseArray<int>();

//Populate the array with a few elements
myArray[15] = 23;
myArray[1024] = 2;

//Print the size of the array. Will be 1025 (remember this is a zero based array)
Console.Out.WriteLine("Virtual size of the array: {0}", myArray.Length);

//Read from a position beyond the limit
Console.Out.WriteLine("Value in position 2048 : {0}", myArray[2048]);

//Print the size of the array again. It will still be 1025 since reading beyond the limit doesn't grow the array
Console.Out.WriteLine("Virtual size of the array: {0}", myArray.Length);

//Show how many chunks the sparse array has. 
Console.Out.WriteLine("Chunk count: {0}", myArray.ChunkCount);

//Set the only populated element in the second chunk to zero and then condense the array
myArray[1024] = 0;
myArray.Condense();

//Check the chunk count again. It should be reduced.
Console.Out.WriteLine("Chunk count: {0}", myArray.ChunkCount);

What is Listening on my Port!

I wanted to do a quick post on a last minute problem and solution I encountered.

I'm heading out to do a presentation on Azure for Windows Phone developers in about 30 minutes and as I was running through my code I came across a problem; the Azure Emulator was hosting my services on port 444 instead of the HTTPS port (443). It was obvious that something must be using port 443 and that the service has decided to use the next port up (444). Problem is the application is not configured to run on this port and I didn't want to go through and start updating configuration files just before a presentation. So how do I find what is occupying that port?

There's a command line tool called netstat that can be used to answer this question. I'll save you the trouble of looking through its documentation. The exact command to type in is as follows:

netstat -o -a

It will return a list of addresses, ports, and process IDs. I looked for the line that mentioned port 443 and saw a process ID of 4332. From there I opened the task manager ([CTRL]+[SHIFT]+[ESC]) and found process 4332 in the "Services" tab. It was a service from VMWare. I shut down the service, stopped the Azure emulator, and restarted the project. This time it came up on port 443!
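
If you would rather check for this from code before starting a demo, something like the sketch below may help. It only reports whether anything is listening on a TCP port; unlike netstat -o it does not tell you which process owns the port. The class and method names are made up; IPGlobalProperties.GetActiveTcpListeners is part of System.Net.NetworkInformation in the desktop framework.

```csharp
using System;
using System.Linq;
using System.Net;
using System.Net.NetworkInformation;

public static class PortCheckSketch
{
    // True when some process on this machine is listening on the given TCP port.
    public static bool IsTcpPortListening(int port)
    {
        IPEndPoint[] listeners = IPGlobalProperties.GetIPGlobalProperties().GetActiveTcpListeners();
        return listeners.Any(endPoint => endPoint.Port == port);
    }

    public static void Main()
    {
        Console.WriteLine("Port 443 already in use: {0}", IsTcpPortListening(443));
    }
}
```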

Adjusting Microsoft Translator WAVE Volume

Video Entry
Download Code (32 Kb)

 

The code in this article was inspired by some questions on Windows Phone 7, but it's generic enough to be used on other .Net based platforms. In the Windows Phone AppHub forums there was a question about altering the volume of the WAVE file that the Microsoft translator service returns. In the StackOverflow forums there was a question about mixing two WAVE files together. I started off working on a solution for the volume question and when I stepped back to examine it I realized I wasn't far away from a solution for the other question. So I have both solutions implemented in the same code. In this first post I'm showing what I needed to do to alter the volume of the WAVE stream that comes from the Microsoft Translation service.

I've kept the code generic enough so that if you want to apply other algorithms to the code you can do so. I've got some ideas on how the memory buffer for the sound data can be better handled that would allow large recordings to be manipulated without keeping the entire recording in memory and allowing the length of the recording to be more easily altered.  But the code as presented demonstrates three things:

  1. Loading a WAVE file from a stream
  2. Altering the WAVE file contents in memory
  3. Saving WAVE files back to a stream

The code for saving a WAVE file is a modified version of the code that I demonstrated some time ago for writing a proper WAVE file for the content that comes from the Microphone buffer.

Prerequisites

I'm making the assumption that you know what a WAVE file and a sample are. I am also assuming that you know how to use the Microsoft Translator web service.

Loading a Wave File

The format for WAVE files is pretty well documented. There's more than one encoding that can be used in WAVE files, but I'm concentrating on PCM encoded WAVE files and will for now ignore all of the other possible encodings. The document that I used can be found here.  There are a few variations from the document that I found when dealing with real WAVE files and I'll comment on those in a moment. In general most of what you'll find in the header are 8, 16, and 32-bit integers and strings. I read the entire header into a byte array and extract the information from that byte array into an appropriate type. To extract a string from the byte array you need to know the starting index for the string and the number of characters it contains. You can then use Encoding.UTF8.GetString to extract the string. If you understand how numbers are encoded (little endian) decoding them is fairly easy. If you want to get a better understanding see the Wikipedia article on the encoding.

Integer Size  Extraction Code
8-bit         data[i]
16-bit        (data[i])|(data[i+1]<<0x08)
32-bit        (data[i])|(data[i+1]<<0x08)|(data[i+2]<<0x10)|(data[i+3]<<0x18)
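
Wrapped up as helper methods, the extraction expressions might look like the sketch below. The helper names are made up. Note that the 16-bit expression as written yields an unsigned value between 0 and 65535; the cast to short is what turns it into a signed sample.

```csharp
using System;
using System.Text;

public static class LittleEndianSketch
{
    // Unsigned 16-bit value (0..65535) stored low byte first.
    public static int ReadUInt16(byte[] data, int i)
    {
        return data[i] | (data[i + 1] << 0x08);
    }

    // Same bytes interpreted as a signed 16-bit value.
    public static short ReadInt16(byte[] data, int i)
    {
        return unchecked((short)(data[i] | (data[i + 1] << 0x08)));
    }

    public static int ReadInt32(byte[] data, int i)
    {
        return data[i] | (data[i + 1] << 0x08) | (data[i + 2] << 0x10) | (data[i + 3] << 0x18);
    }

    // Four character tags such as "RIFF", "WAVE", "fmt " and "data".
    public static string ReadTag(byte[] data, int i)
    {
        return Encoding.UTF8.GetString(data, i, 4);
    }

    public static void Main()
    {
        byte[] sampleRate = { 0x44, 0xAC, 0x00, 0x00 };
        Console.WriteLine(ReadInt32(sampleRate, 0)); // 44100
    }
}
```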

Offset  Title           Size           Type                 Description
0       ChunkID         4              string(4)            literal string "RIFF"
4       ChunkSize       4              int32                Size of the entire file minus eight bytes
8       Format          4              string(4)            literal string "WAVE"
12      SubChunkID      4              string(4)            literal string "fmt "
16      SubChunk1Size   4              int32                size of the rest of the subchunk
20      AudioFormat     2              int16                Should be 1 for PCM encoding.
22      Channel Count   2              int16                1 for mono, 2 for stereo,...
24      SampleRate      4              int32
28      ByteRate        4              int32                (SampleRate)*(Channel Count)*(Bits Per Sample)/8
32      Block Align     2              int16                (Channel Count)*(Bits Per Sample)/8
34      BitsPerSample   2              int16
        ExtraParamSize  2              int16                possibly not there
        ExtraParams     ?              ?                    possibly not there
36+x    SubChunk2ID     4              string(4)            literal string "data"
40+x    SubChunk2Size   4              int32
44+x    data            SubChunk2Size  byte[SubChunk2Size]

The header will always be at least 44 bytes long, so I start off reading the first 44 bytes of the stream. The SubChunk1Size will normally contain the value 16. If it's greater than 16 then the header is longer than 44 bytes and I read the rest. I've allowed for a header size of up to 60 bytes (which is larger than any I have encountered). A header larger than 44 bytes will generally mean that there is an extra parameter at the end of SubChunk1. For what I'm doing the contents of the extra parameters don't matter, but I still need to account for the space that they consume to properly read the header.
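The offset arithmetic implied by the table can be written out on its own. This is a minimal sketch; the WaveHeaderLayout class and its members are hypothetical names used only to show how the positions shift when SubChunk1Size is larger than 16.

```csharp
using System;

public static class WaveHeaderLayout
{
    public const int MinimumHeaderSize = 44;
    public const int StandardFmtSize = 16;

    // Number of header bytes beyond the first 44.
    public static int ExtraHeaderBytes(int subChunk1Size)
    {
        return Math.Max(0, subChunk1Size - StandardFmtSize);
    }

    // Offset of the "data" chunk ID (36 when SubChunk1Size is 16).
    public static int DataChunkIdOffset(int subChunk1Size)
    {
        return 20 + subChunk1Size;
    }

    // Offset of the 4-byte data length that follows the "data" ID.
    public static int DataSizeOffset(int subChunk1Size)
    {
        return 24 + subChunk1Size;
    }
}
```

With a SubChunk1Size of 18 there are 2 extra bytes, the "data" ID sits at offset 38, its length at offset 42, and the samples begin at offset 46.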

To my surprise the contents of the fields in the header are not always populated. Some audio editors leave some of the fields zeroed out. My first attempt to read a WAVE file was with a file that came from the open source audio editor Audacity. Among other fields the BitsPerSample field was zeroed. I'm not sure if this is allowed by the format or not. It certainly is not in any of the spec sheets that I've found. But when I encounter this I assume a value of 16.

Regardless of whether a WAVE file contains 8-bit, 16-bit, or 32-bit samples, when it is read in I store the values in an array of doubles. I chose to do this because double works out better for some of the math operations I have in mind.

public void ReadWaveData(Stream sourceStream, bool normalizeAmplitude = false)
{
    //In general I should only need 44 bytes. I'm allocating extra memory because of a variance I've seen in some WAV files. 
    byte[] header = new byte[60];
    int bytesRead = sourceStream.Read(header, 0, 44);
    if(bytesRead!=44)
        throw new InvalidDataException(String.Format("This can't be a wave file. It is only {0} bytes long!",bytesRead));

    int audioFormat = (header[20]) | (header[21] << 8);
    if (audioFormat != 1)
        throw new Exception("Only PCM Waves are supported (AudioFormat=1)");

    #region mostly useless code
    string chunkID = Encoding.UTF8.GetString(header, 0, 4);
    if (!chunkID.Equals("RIFF"))
    {
        throw new InvalidDataException(String.Format("Expected a ChunkID of 'RIFF'. Received a chunk ID of {0} instead.", chunkID));
    }
    int chunkSize = (header[4]) | (header[5] << 8) | (header[6] << 16) | (header[7] << 24);
    string format = Encoding.UTF8.GetString(header, 8, 4);
    if (!format.Equals("WAVE"))
    {
        throw new InvalidDataException(String.Format("Expected a format of 'WAVE'. Received a format of {0} instead.", format));
    }
    string subChunkID = Encoding.UTF8.GetString(header, 12, 4);
    //Compare the sub chunk ID (not the format, which is always "WAVE" at this point)
    if (!subChunkID.Equals("fmt "))
    {
        throw new InvalidDataException(String.Format("Expected a subchunkID of 'fmt '. Received a subchunk ID of {0} instead.", subChunkID));
    }
    int subChunkSize = (header[16]) | (header[17] << 8) | (header[18] << 16) | (header[19] << 24);
    #endregion

    if (subChunkSize > 16)
    {
        var bytesNeeded = subChunkSize - 16;
        if(bytesNeeded+44 > header.Length)
            throw new InvalidDataException("The WAV header is larger than expected. ");
        sourceStream.Read(header, 44, bytesNeeded);
    }

    ChannelCount = (header[22]) | (header[23] << 8);
    SampleRate = (header[24]) | (header[25] << 8) | (header[26] << 16) | (header[27] << 24);
    #region Useless Code
    int byteRate = (header[28]) | (header[29] << 8) | (header[30] << 16) | (header[31] << 24);
    int blockAlign = (header[32]) | (header[33] << 8);
    #endregion
    BitsPerSample = (header[34]) | (header[35] << 8);

    #region Useless Code
    string subchunk2ID = Encoding.UTF8.GetString(header, 20 + subChunkSize, 4);
    #endregion

    var offset = 24 + subChunkSize;
    int dataLength = (header[offset+0]) | (header[offset+1] << 8) | (header[offset+2] << 16) | (header[offset+3] << 24);

    //I can't find any documentation stating that I should make the following inference, but I've
    //seen wave files that have 0 in the bits per sample field. These wave files were 16-bit, so 
    //if bits per sample isn't specified I will assume 16 bits. 
    if (BitsPerSample == 0)
    {
        BitsPerSample = 16;
    }

    byte[] dataBuffer = new byte[dataLength];

    bytesRead = sourceStream.Read(dataBuffer, 0, dataBuffer.Length);


    Debug.Assert(bytesRead == dataLength);


    if (BitsPerSample == 8)
    {
        byte[] unadjustedSoundData = new byte[dataBuffer.Length / (BitsPerSample / 8)];
        Buffer.BlockCopy(dataBuffer, 0, unadjustedSoundData, 0, dataBuffer.Length);

        SoundData = new double[unadjustedSoundData.Length];
        for (var i = 0; i < (unadjustedSoundData.Length); ++i)
        {
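            //8-bit PCM samples are unsigned (0 to 255, with silence at 128), so these values are all non-negative
            SoundData[i] = 128d*(double)unadjustedSoundData[i];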
        }

    }
    else if (BitsPerSample == 16)
    {
        short[] unadjustedSoundData = new short[dataBuffer.Length / (BitsPerSample / 8)];
        Buffer.BlockCopy(dataBuffer, 0, unadjustedSoundData, 0, dataBuffer.Length);


        SoundData = new double[unadjustedSoundData.Length];
        for (var i = 0; i < (unadjustedSoundData.Length); ++i)
        {
            SoundData[i] = (double) unadjustedSoundData[i];
        }
    }
    else if(BitsPerSample==32)
    {
        int[] unadjustedSoundData = new int[dataBuffer.Length / (BitsPerSample / 8)];
        Buffer.BlockCopy(dataBuffer, 0, unadjustedSoundData, 0, dataBuffer.Length);

        SoundData = new double[unadjustedSoundData.Length];
        for (var i = 0; i < (unadjustedSoundData.Length); ++i)
        {
            SoundData[i] = (double)unadjustedSoundData[i];
        }
    }

    Channels = new PcmChannel[ChannelCount];
    for (int i = 0; i < ChannelCount;++i )
    {
        Channels[i]=new PcmChannel(this,i);
    }
    if (normalizeAmplitude)
        NormalizeAmplitude();

}

Mono vs Stereo

In a mono (single channel) file the samples are ordered one after another, no mystery there. For stereo files the data stream will contain the first sample for channel 0, then the first sample for channel 1, then the second sample for channel 0, second sample for channel 1, and so on. The samples alternate between the left channel and the right channel. The sample data is stored in memory in the same way, in an array called SoundData. To work exclusively with one channel or the other there is also a property named Channels (an array of PcmChannel) that can be used to access that one channel.

public class PcmChannel
{
    internal PcmChannel(PcmData parent, int channel)
    {
        Channel = channel;
        Parent = parent;
    }
    protected PcmData Parent { get; set;  }
    public int Channel { get; protected set; }
    public int Length
    {
        get { return (int)(Parent.SoundData.Length/Parent.ChannelCount);  }
    }
    public double this[int index]
    {
        get { return Parent.SoundData[index*Parent.ChannelCount + Channel]; }
        set { Parent.SoundData[index*Parent.ChannelCount + Channel] = value; }
    }
}

//The following is a simplified definition showing how the PcmChannel
//data type relates to our PCM data. The actual PcmData class has
//more members than what follows.
public class PcmData
{
   public double[] SoundData { get; set; }
   public int ChannelCount { get; set; }
   public PcmChannel[] Channels { get; set; }
}
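The index math used by the PcmChannel indexer can be tried out on a plain array. The sketch below is illustrative only; the Interleaving class and its methods are hypothetical helpers, not members of the PcmData class.

```csharp
using System;

public static class Interleaving
{
    // Position in the interleaved array of sample number sampleIndex for the given channel.
    public static int InterleavedIndex(int sampleIndex, int channel, int channelCount)
    {
        return sampleIndex * channelCount + channel;
    }

    // Copies the samples of one channel out of an interleaved buffer.
    public static double[] ExtractChannel(double[] interleaved, int channel, int channelCount)
    {
        var result = new double[interleaved.Length / channelCount];
        for (int i = 0; i < result.Length; ++i)
        {
            result[i] = interleaved[InterleavedIndex(i, channel, channelCount)];
        }
        return result;
    }
}
```

Given the stereo buffer { 10, -10, 20, -20, 30, -30 }, channel 0 contains 10, 20, 30 and channel 1 contains -10, -20, -30. The PcmChannel class does the same lookup without making a copy, so writes through it change the original data.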

Where's 24-bit support?

Yes, 24-bit WAVE files do exist. I'm not supporting them (yet) because there's more code required to handle them and most of the scenarios I have in mind are going to use 8 and 16-bit files. Adding support for 32-bit files was only 5 more lines of code. I'll be handling 24-bit files in a forthcoming update to the code.
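The extra work comes from there being no 3-byte integer type to copy into, so each sample has to be assembled by hand and sign extended. What follows is one possible sketch, assuming signed little endian 24-bit samples. It is not the forthcoming implementation, and the Pcm24 class name is made up for illustration.

```csharp
using System;

public static class Pcm24
{
    // Converts packed 24-bit little endian samples into doubles.
    public static double[] Decode(byte[] data)
    {
        var samples = new double[data.Length / 3];
        for (int i = 0; i < samples.Length; ++i)
        {
            int b = i * 3;
            // Put the three bytes into the upper 24 bits of an int, then shift
            // right by 8. The right shift on a signed int preserves the sign.
            int value = ((data[b] << 8) | (data[b + 1] << 16) | (data[b + 2] << 24)) >> 8;
            samples[i] = value;
        }
        return samples;
    }
}
```

The bytes 0xFF, 0xFF, 0x7F decode to 8388607 (the largest 24-bit value) and 0x00, 0x00, 0x80 decode to -8388608.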

Altering the Sound Data

Changes made to the values in the SoundData[] array will alter the sound data. There are some constraints on how the data can be modified. Since I'm writing this to a 16-bit WAVE file the maximum and minimum values that can be written out are 32,767 and -32,768. The double data type has a range significantly larger than this. The properties AdjustmentFactor and AdjustmentOffset are used to alter the sound data when it is being prepared to be written back to a file. They apply a linear transformation to the sound data (remember y=mx+b?). Finding the right values for these is done for you through the NormalizeAmplitude method. Calling this method after you've altered your sound data will result in appropriate values being chosen. By default this method will try to normalize the sound data to 99% of maximum amplitude. You can pass an argument between 0 and 1 to this method for some other amplitude.

public void NormalizeAmplitude( double percentMax = 0.99d)
{
    var max = SoundData.Max();
    var min = SoundData.Min();

    double rangeSize = max - min+1 ;
    AdjustmentFactor = ((percentMax * (double)short.MaxValue) - percentMax * (double)short.MinValue) / (double)rangeSize;
    AdjustmentOffset = (percentMax * (double)short.MinValue) - (min * AdjustmentFactor);

    int maxExpected = (int)(max * AdjustmentFactor + AdjustmentOffset);
    int minExpected = (int)(min * AdjustmentFactor + AdjustmentOffset);
}
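To see what values the method arrives at, the same arithmetic can be run on a small array. This is a standalone sketch that mirrors the formulas above; the AmplitudeMath class and ComputeAdjustment method are hypothetical names and return the two values through out parameters instead of setting properties.

```csharp
using System;
using System.Linq;

public static class AmplitudeMath
{
    // Computes the m (factor) and b (offset) of y = mx + b using the same
    // formulas as NormalizeAmplitude.
    public static void ComputeAdjustment(double[] samples, double percentMax,
                                         out double factor, out double offset)
    {
        double min = samples.Min();
        double max = samples.Max();
        double rangeSize = max - min + 1;
        factor = (percentMax * short.MaxValue - percentMax * short.MinValue) / rangeSize;
        offset = percentMax * short.MinValue - min * factor;
    }
}
```

For samples ranging from -100 to 100 at the default of 0.99 the factor comes out to roughly 322.8. The smallest sample is mapped to 0.99 * short.MinValue, and the largest sample lands slightly below 0.99 * short.MaxValue because of the +1 in the range size.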

Saving WAVE Data

To save the WAVE data I'm using a variant of something I used to save the stream that comes from the microphone. The original form of the code had a bug that makes a difference when working with a stream with multiple channels. The microphone produces a single channel stream and wasn't impacted by this bug (but it's fixed here). The code for writing the wave produces a header from the parameters it is given, then it writes out the WAVE data. The WAVE data must be converted from the double[] array to a byte[] array containing 16-bit integers in little endian format.

public class PcmData
{
    public void Write(Stream destinationStream)
    {
        byte[] writeData = new byte[SoundData.Length*2];
        short[] conversionData = new short[SoundData.Length];

        //convert the double[] data back to int16[] data
        for(int i=0;i<SoundData.Length;++i)
        {
            double sample = ((SoundData[i]*AdjustmentFactor)+AdjustmentOffset);
            //if the value goes outside of range then clip it
            sample = Math.Min(sample, (double) short.MaxValue);
            sample = Math.Max(sample, short.MinValue);
            conversionData[i] = (short) sample;
        }
        int max = conversionData.Max();
        int min = conversionData.Min();
        //put the int16[] data into a byte[] array
        Buffer.BlockCopy(conversionData, 0, writeData, 0, writeData.Length);

        WaveHeaderWriter.WriteHeader(destinationStream,writeData.Length,ChannelCount,SampleRate);
        destinationStream.Write(writeData,0,writeData.Length);
    }
}

public class WaveHeaderWriter
{
    static byte[] RIFF_HEADER = new byte[] { 0x52, 0x49, 0x46, 0x46 };
    static byte[] FORMAT_WAVE = new byte[] { 0x57, 0x41, 0x56, 0x45 };
    static byte[] FORMAT_TAG = new byte[] { 0x66, 0x6d, 0x74, 0x20 };
    static byte[] AUDIO_FORMAT = new byte[] { 0x01, 0x00 };
    static byte[] SUBCHUNK_ID = new byte[] { 0x64, 0x61, 0x74, 0x61 };
    private const int BYTES_PER_SAMPLE = 2;

    public static void WriteHeader(
            System.IO.Stream targetStream,
            int byteStreamSize,
            int channelCount,
            int sampleRate)
    {

        int byteRate = sampleRate * channelCount * BYTES_PER_SAMPLE;
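        //Block align covers one sample for every channel: (Channel Count)*(Bits Per Sample)/8
        int blockAlign = channelCount * BYTES_PER_SAMPLE;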

        targetStream.Write(RIFF_HEADER, 0, RIFF_HEADER.Length);
        targetStream.Write(PackageInt(byteStreamSize + 36, 4), 0, 4);

        targetStream.Write(FORMAT_WAVE, 0, FORMAT_WAVE.Length);
        targetStream.Write(FORMAT_TAG, 0, FORMAT_TAG.Length);
        targetStream.Write(PackageInt(16, 4), 0, 4);//Subchunk1Size    

        targetStream.Write(AUDIO_FORMAT, 0, AUDIO_FORMAT.Length);//AudioFormat   
        targetStream.Write(PackageInt(channelCount, 2), 0, 2);
        targetStream.Write(PackageInt(sampleRate, 4), 0, 4);
        targetStream.Write(PackageInt(byteRate, 4), 0, 4);
        targetStream.Write(PackageInt(blockAlign, 2), 0, 2);
        targetStream.Write(PackageInt(BYTES_PER_SAMPLE * 8), 0, 2);
        //targetStream.Write(PackageInt(0,2), 0, 2);//Extra param size
        targetStream.Write(SUBCHUNK_ID, 0, SUBCHUNK_ID.Length);
        targetStream.Write(PackageInt(byteStreamSize, 4), 0, 4);
    }

    static byte[] PackageInt(int source, int length = 2)
    {
        if ((length != 2) && (length != 4))
            throw new ArgumentException("length must be either 2 or 4", "length");
        var retVal = new byte[length];
        retVal[0] = (byte)(source & 0xFF);
        retVal[1] = (byte)((source >> 8) & 0xFF);
        if (length == 4)
        {
            retVal[2] = (byte)((source >> 0x10) & 0xFF);
            retVal[3] = (byte)((source >> 0x18) & 0xFF);
        }
        return retVal;
    }
}

Using the Code

Once you've gotten the wave stream only a few lines of code are needed to do the work. For the example program I am downloading a spoken phrase from the Microsoft Translator service, amplifying it, and then writing both the original and amplified versions to files.

static void Main(string[] args)
{
    PcmData pcm;

    //Download the WAVE stream
    MicrosoftTranslatorService.LanguageServiceClient client = new LanguageServiceClient();            
    string waveUrl = client.Speak(APP_ID, "this is a volume test", "en", "audio/wav","");
    WebClient wc = new WebClient();
    var soundData = wc.DownloadData(waveUrl);

          
    //Load the WAVE stream and let its amplitude be adjusted to 99% maximum
    using (var ms = new MemoryStream(soundData))
    {
        pcm = new PcmData(ms, true);               
    }

    //Write the amplified stream to a file
    using (Stream s = new FileStream("amplified.wav", FileMode.Create, FileAccess.Write))
    {
        pcm.Write(s);
    }

    //write the original unaltered stream to a file
    using (Stream s = new FileStream("original.wav", FileMode.Create, FileAccess.Write))
    {
        s.Write(soundData,0,soundData.Length);
    }
}

The End Result

The code works as designed, but I found a few scenarios that can make it ineffective. One scenario is that not all phones have the same frequency response for their speakers. Frequencies that come through loud and clear on one phone may come through sounding quieter on another. The other scenario is that the source file may have a sample that goes to the maximum or minimum reading even though a majority of the other samples come nowhere near the same level of amplitude. When this occurs the spurious sample will limit the amount of amplification that is applied to the file. I opened an original and amplified WAVE file in Audacity to see my results and I was pleased to see that the amplified WAVE does actually look louder when I view its graph in Audacity.

Part 2 - Overlaying Wave Files

The other problem that this code can solve is combining wave files together in various ways. I'll be putting that up in the next post. Between now and then I've got a presentation at the Windows Phone Developers Atlanta meeting this week (if you are in the Atlanta area come on out!) and will get back to this code after the presentation.

Announcing Windows Phone Developers Atlanta

I wanted to take a moment to announce a meetup group for Windows Phone 7 developers based out of Atlanta. I'm one of the cofounders and organizers for this group. We've had several meetings but I wanted to wait until we got into a routine before making an announcement. I've been pretty comfortable with how things are going so I'm sharing with you now. 

The information for our meetings is posted on Meetup.com ( http://www.meetup.com/Win-Phone-7-Developers-Atlanta/ ). We meet one Tuesday per month (except for December, for which we've decided to have no meetings because of end-of-the-year demands and holidays). Our next meeting will be Tuesday 24 January 2012. This month I'll be presenting on Windows Azure for Windows Phone developers and we may have one other person presenting on Localization. After the meeting I'll share the notes (and possibly a recording of the presentation).

Referencing Pages in Other Assemblies

I've been working with Google APIs recently and most of the ones that I've used require OAuth2 authentication so that the user can grant the application access to their data. Rather than copy-and-paste the OAuth2 implementation to the different projects that I'm using I decided to make a class library that included the authentication code. From a usage standpoint I wanted to do something similar to the tasks and choosers where you create an object, call a method, and after the component does its magic you get back the item that the user chose.

The format for the URI to a page in another assembly that you would use in Silverlight looked like the following:

new Uri("{assemblyName};component/{pagePath.xaml}", UriKind.Relative);

Of course {assemblyName} and {pagePath.xaml} are placeholders for your actual assembly name and the path to the page, not literal values. It took me more time than I care to admit to figure out why this would not work on Windows Phone 7. The correct format to use on Windows Phone 7 is as follows:

new Uri("/{assemblyName};component/{pagePath.xaml}", UriKind.Relative);

See the difference? It's subtle, but the difference is the little forward slash at the beginning of the string. Had I paid closer attention to the exception message that was returned I would have realized this.

The one thing I don't like about this method is that when one navigates away from the page any program state that is saved in the page's code-behind is going to be lost. I also plan to provide an alternative method for showing the OAuth2 page in a user control. Each method has its advantages and disadvantages. I'm almost done with the code and will be posting it here later this week.