Visual Studio 2008 has supports “Enable Instruction Functions” options (see a project settings -> C/C++ -> Optimization). Note that this option can enlarge code.
Also memcpy function implementation has written with using sse2 (movdqa).
int CopyMemSSE4(int* piDst, int* piSrc, unsigned long SizeInBytes)
{
// Initialize pointers to start of the USWC memory
_asm
{
mov esi, piSrc
mov edx, piSrc
// Initialize pointer to end of the USWC memory
add edx, SizeInBytes
// Initialize pointer to start of the cacheable WB buffer
mov edi, piDst
// Start of Bulk Load loop
inner_start:
// Load data from USWC Memory using Streaming Load
MOVNTDQA xmm0, xmmword ptr [esi]
MOVNTDQA xmm1, xmmword ptr [esi+16]
MOVNTDQA xmm2, xmmword ptr [esi+32]
MOVNTDQA xmm3, xmmword ptr [esi+48]
// Copy data to buffer
MOVDQA xmmword ptr [edi], xmm0
MOVDQA xmmword ptr [edi+16], xmm1
MOVDQA xmmword ptr [edi+32], xmm2
MOVDQA xmmword ptr [edi+48], xmm3
// Increment pointers by cache line size and test for end of loop
add esi, 040h
add edi, 040h
cmp esi, edx
jne inner_start
}
// End of Bulk Load loop
return 0;
}
#define DATA_SIZE 0x01000000
int main(int argc, char* argv[])
{
int *piSrc = NULL;
int *piDst = NULL;
unsigned long dwDataSizeInBytes = sizeof(int) * DATA_SIZE;
piSrc = (int *)_aligned_malloc(dwDataSizeInBytes, dwDataSizeInBytes);
piDst = (int *)_aligned_malloc(dwDataSizeInBytes, dwDataSizeInBytes);
memset(piSrc, 255, dwDataSizeInBytes);
memset(piDst, 0, dwDataSizeInBytes);
CopyMemSSE4(piDst, piSrc, dwDataSizeInBytes);
_aligned_free(piSrc);
_aligned_free(piDst);
}
Additional links:
Note: Based on some feedback I should clarify that this does not cover C99 syntax
Even though the C programming language has been around since the late 1960’s, many programmers still have trouble understanding how C declarations are formed. This is not unsurprising due to the complexity that can arise when mixing pointer, array and function-pointer declarations.
In this posting we shall look at some complex declarations to try and understand them by considering how they are formed. The intent is not so you can go off and write wonderfully complex declarations, but more hopefully you may actually be able to understand someone else’s code. Finally we shall look at how most complex declarations can be easily simplified.
Here I’m going to focus on object declarations/definitions rather than functions. Also, in this posting I’m not going to examine structure, union or enumeration specifies. They’ll keep for another day.
Note: This paper is derived from the book The Security Development Lifecycle, by Michael Howard and Steve Lipner, Microsoft Press, 2006.
Prohibiting the use of banned APIs is a good way to remove a significant number of code vulnerabilities — this practice is reflected in Stage 6 of The Microsoft Security Development Lifecycle: “Establish and Follow Best Practices for Development.” It can also be referenced in Chapter 11 of the Microsoft Press Book The Security Development Lifecycle.
When the C runtime library (CRT) was first created about 25 years ago, the threats to computers were different; machines were not as interconnected as they are today, and attacks were not as prevalent. With this in mind, a subset of the C runtime library must be deprecated for new code and, over time, removed from earlier code. It’s just too easy to get code wrong that uses these outdated functions. Even some of the classic replacement functions are prone to error, too.
This list is the SDL view of what comprises banned APIs; it is derived from experience with real-world security bugs and focuses almost exclusively on functions that can lead to buffer overruns (Howard, LeBlanc, and Viega 2005). Any function in this section’s tables must be replaced with a more secure version. Obviously, you cannot replace a banned API with another banned API. For example, replacing strcpy with strncpy is not valid because strncpy is banned, too.
Also note that some of the function names might be a little different, depending on whether the function takes ASCII, Unicode, _T (ASCII or Unicode), or multibyte chars. Some function names might include A or W at the end of the name. For example, the StrSafe StringCbCatEx function is also available as StringCbCatExW (Unicode) and StringCbCatExA (ASCII).
More info