diff options
author | giraffedata <giraffedata@9d0c8265-081b-0410-96cb-a4ca84ce46f8> | 2008-01-19 23:52:19 +0000 |
---|---|---|
committer | giraffedata <giraffedata@9d0c8265-081b-0410-96cb-a4ca84ce46f8> | 2008-01-19 23:52:19 +0000 |
commit | df4bd564d2a033685a886b3992fdc1a6b6b6bde6 (patch) | |
tree | aaad5792ca7cf761f638622047e8af7e35031b5d | |
parent | f3ddd5939dd512400cbb2ddbc247c5cd3d8a248d (diff) | |
download | netpbm-mirror-df4bd564d2a033685a886b3992fdc1a6b6b6bde6.tar.gz netpbm-mirror-df4bd564d2a033685a886b3992fdc1a6b6b6bde6.tar.xz netpbm-mirror-df4bd564d2a033685a886b3992fdc1a6b6b6bde6.zip |
Rewrite low-memory algorithm; use temp files and conserve virtual as well as real memory
git-svn-id: http://svn.code.sf.net/p/netpbm/code/trunk@566 9d0c8265-081b-0410-96cb-a4ca84ce46f8
-rw-r--r-- | doc/HISTORY | 3 | ||||
-rw-r--r-- | editor/pamflip.c | 1143 |
2 files changed, 690 insertions, 456 deletions
diff --git a/doc/HISTORY b/doc/HISTORY index b99fe3bb..d74b86a4 100644 --- a/doc/HISTORY +++ b/doc/HISTORY @@ -6,6 +6,9 @@ CHANGE HISTORY not yet BJH Release 10.42.00 + pamflip: Rewrite low-memory algorithm; use temp files and + conserve virtual as well as real memory. + pamundice: Fix bogus error about missing "-instem" option. Add pm_tmpfile_fd() and pm_make_tmpfile_fd(). diff --git a/editor/pamflip.c b/editor/pamflip.c index fa3b488d..e47daf36 100644 --- a/editor/pamflip.c +++ b/editor/pamflip.c @@ -11,82 +11,47 @@ */ /* - transformGen() is the general transformation function. + transformNonPbmChunk() is the general transformation function. + It can tranform anything, albeit slowly and expensively. The following are enhancements for specific cases: - transformRowByRowPbm() + transformRowByRowPbm(): + PBM image with left-right or null transformation transformRowsBottomTopPbm() + PBM image with bottom-top transformation transformRowByRowNonPbm() + non-PBM image with left-right or null transformation + (also works for PBM, but we don't use it) transformRowsBottomTopNonPbm() + non-PBM image with bottom-top transformation + (also works for PBM, but we don't use it) transformPbm() - - Although we use transformGen() only when none of the enhancement - functions apply, it is capable of handling all cases. (Only that it - is slow, and uses more memory.) In the same manner, transformPbm() is - capable of handling all pbm transformations and transformRowByRowNonPbm() - transformRowsBottomTopNonPbm() are capable of handling pbm. - - - There is some fancy virtual memory management in transformGen() to avoid - page thrashing when you flip a large image in a columns-for-rows - way (e.g. -transpose). - - The page thrashing we're trying to avoid could happen because the - output of the transformation is stored in an array of tuples in - virtual memory. A tuple array is stored in column-first order, - meaning that all the columns of particular row are contiguous, the - next row is next to that, etc. If you fill up that array by - filling in Column 0 sequentially in every row from top to bottom, - you will touch a lot of different virtual memory pages, and every - one has to be paged in as you touch it. - - If the number of virtual memory pages you touch exceeds the amount - of real memory the process can get, then by the time you hit the bottom - of the tuple array, the pages that hold the top are already paged out. - So if you go back and do Column 1 from top to bottom, you will again - touch lots of pages and have to page in every one of them. Do this - for 100 columns, and you might page in every page in the array 100 times - each, putting a few bytes in the page each time. - - That is very expensive. Instead, you'd like to keep the same pages in - real memory as long as possible and fill them up as much as you can - before paging them out and working on a new set of pages. You can do - that by doing Column 0 from top to say Row 10, then Column 1 from top - to Row 10, etc. all the way across the image. Assuming 10 rows fits - in real memory, you will keep the virtual memory for the first 10 rows - of the tuple array in real memory until you've filled them in completely. - Now you go back and do Column 0 from Row 11 to Row 20, then Column 1 - from Row 11 to Row 20, and so on. - - So why are we even trying to fill in column by column instead of just - filling in row by row? Because we're reading an input image row by row - and transforming it in such a way that a row of the input becomes - a column of the output. In order to fill in a whole row of the output, - we'd have to read a whole column of the input, and then we have the same - page thrashing problem in the input. - - So the optimal procedure is to do N output rows in each pass, where - N is the largest number of rows we can fit in real memory. In each - pass, we read certain columns of every row of the input and output - every column of certain rows of the output. The output area for - the rows in the pass gets paged in once during the pass and then - never again. Note that some pages of every row of the input get - paged in once in each pass too. As each input page is referenced - only in one burst, input pages do not compete with output pages for - real memory -- the working set is the output pages, which get referenced - cyclically. - - This all worked when we used the pnm xel format, but now that we - use the pam tuple format, there's an extra memory reference that - could be causing trouble. Because tuples have varying depth, a pam - row is an array of pointers to the tuples. To access a tuple, we - access the tuple pointer, then the tuple. We could probably do better, - because the samples are normally in the memory immediately following - the tuple pointers, so we could compute where a tuple's samples live - without actually loading the tuple address from memory. I.e. the - address of the tuple for Column 5 of Row 9 of a 3-deep 100-wide - image is (void*)tuples[9] + 100 * sizeof(tuple*) + 5*(3*sizeof(sample)). + PBM image with column-for-row transformation + (also works for all other transformations, but we don't use it) + transformNonPbmWhole() + non-PBM column-for-row transformation + (also works for PBM, but we don't use it) + + Except we don't use any of the above if we can't fit the entire image + into the amount of memory the user gave us to work with. In that + case, we fall back to transformNonPbmChunk(). + + Note that while virtual memory can be limited (e.g. sometimes the + machine's 32 bit address space itself is a limit), real memory is + also a concern. With a row-for-column transformation + (e.g. -transpose), if we can't fit the entire image into _real_ + memory, we will page thrash within an inch our lives. So we count + on the user to tell us how much real memory we should expect to + get -- his -memsize option is the lesser of the available virtual + and real memory. + + Before Netpbm 10.42 (March 2008), we had a different method for + dealing with memory shortage. We didn't worry about virtual memory + at all, and always kept the whole target image in memory. We did + not use temporary files. But to avoid page thrashing in a + column-for-row transformation, we transformed in column stripes of + the source image, reading the input image through multiple times. */ #define _BSD_SOURCE 1 /* Make sure strdup() is in string.h */ @@ -104,22 +69,6 @@ enum xformType {LEFTRIGHT, TOPBOTTOM, TRANSPOSE}; -struct cmdlineInfo { - /* All the information the user supplied in the command line, - in a form easy for the program to use. - */ - const char *inputFilespec; /* Filespec of input file */ - unsigned int xformCount; - /* Number of transforms in the 'xformType' array */ - enum xformType xformList[10]; - /* Array of transforms to be applied, in order */ - unsigned int availableMemory; - unsigned int pageSize; - unsigned int verbose; -}; - - - static void parseXformOpt(const char * const xformOpt, unsigned int * const xformCountP, @@ -165,6 +114,175 @@ parseXformOpt(const char * const xformOpt, +/* See transformPoint() for an explanation of the transform matrix types. + The difference between the two types is that 'xformCore' is particular + to the source image dimensions and can be used to do the transformation, + while 'xformCore' is independent of the source image and just + tells what kind of transformation. +*/ + +struct xformCore { + /* a b + c d + */ + int a; /* -1, 0, or 1 */ + int b; /* -1, 0, or 1 */ + int c; /* -1, 0, or 1 */ + int d; /* -1, 0, or 1 */ +}; + + + +struct xformMatrix { + /* a b 0 + c d 0 + e f 1 + */ + int a; /* -1, 0, or 1 */ + int b; /* -1, 0, or 1 */ + int c; /* -1, 0, or 1 */ + int d; /* -1, 0, or 1 */ + int e; /* 0 or maximum column number in target image */ + int f; /* 0 or maximum row number in target image */ +}; + + + +static void +leftright(struct xformCore * const xformP) { + xformP->a = - xformP->a; + xformP->c = - xformP->c; +} + + + +static void +topbottom(struct xformCore * const xformP) { + xformP->b = - xformP->b; + xformP->d = - xformP->d; +} + + + +static void +swap(int * const xP, int * const yP) { + + int const t = *xP; + + *xP = *yP; + *yP = t; +} + + + +static void +transpose(struct xformCore * const xformP) { + swap(&xformP->a, &xformP->b); + swap(&xformP->c, &xformP->d); +} + + + +static void +computeXformCore(unsigned int const xformCount, + enum xformType const xformType[], + struct xformCore * const xformP) { + + struct xformCore const nullTransform = {1, 0, 0, 1}; + + unsigned int i; + + *xformP = nullTransform; /* initial value */ + + for (i = 0; i < xformCount; ++i) { + switch (xformType[i]) { + case LEFTRIGHT: + leftright(xformP); + break; + case TOPBOTTOM: + topbottom(xformP); + break; + case TRANSPOSE: + transpose(xformP); + break; + } + } +} + + + +static void +xformDimensions(struct xformCore const xform, + unsigned int const inCols, + unsigned int const inRows, + unsigned int * const outColsP, + unsigned int * const outRowsP) { +/*---------------------------------------------------------------------------- + Compute the output dimensions from the input dimensions given a + matrix that describes a transformation. + + E.g. if it's a 90 degree rotation of a 10 x 20 image, the output is + a 20 x 10 image. +-----------------------------------------------------------------------------*/ + *outColsP = abs(xform.a * inCols + xform.c * inRows); + *outRowsP = abs(xform.b * inCols + xform.d * inRows); +} + + + +static void +computeXformMatrix(struct xformMatrix * const xformP, + unsigned int const sourceCols, + unsigned int const sourceRows, + struct xformCore const xformCore) { + + int colMax = xformCore.a * (sourceCols-1) + xformCore.c * (sourceRows-1); + int rowMax = xformCore.b * (sourceCols-1) + xformCore.d * (sourceRows-1); + + xformP->a = xformCore.a; + xformP->b = xformCore.b; + xformP->c = xformCore.c; + xformP->d = xformCore.d; + xformP->e = colMax < 0 ? -colMax : 0; + xformP->f = rowMax < 0 ? -rowMax : 0; +} + + + +struct cmdlineInfo { + /* All the information the user supplied in the command line, + in a form easy for the program to use. + */ + const char * inputFilespec; /* Filespec of input file */ + struct xformCore xform; + /* Handy mathematical representation of all of transform options */ + size_t availableMemory; + unsigned int pageSize; + unsigned int verbose; +}; + + + +static void +interpretMemorySize(unsigned int const memsizeSpec, + unsigned int const memsizeOpt, + size_t * const availableMemoryP) { + + size_t const sizeMax = (size_t)(-1); + unsigned int const Meg = 1024 * 1024; + + if (memsizeSpec) { + if (memsizeOpt > sizeMax / Meg) + pm_error("-memsize value too large: %u MiB. Maximum this program " + "can handle is %u MiB", + memsizeOpt, sizeMax / Meg); + *availableMemoryP = memsizeOpt * Meg; + } else + *availableMemoryP = sizeMax; +} + + + static void parseCommandLine(int argc, char ** const argv, struct cmdlineInfo * const cmdlineP) { @@ -182,7 +300,11 @@ parseCommandLine(int argc, char ** const argv, unsigned int lr, tb, xy, r90, r270, r180, null; unsigned int memsizeSpec, pagesizeSpec, xformSpec; unsigned int memsizeOpt; - const char *xformOpt; + const char * xformOpt; + unsigned int xformCount; + /* Number of transforms in the 'xformType' array */ + enum xformType xformList[10]; + /* Array of transforms to be applied, in order */ MALLOCARRAY(option_def, 100); @@ -221,43 +343,38 @@ parseCommandLine(int argc, char ** const argv, pm_error("You may specify only one type of flip."); if (lr + tb + xy + r90 + r180 + r270 + null == 1) { if (lr) { - cmdlineP->xformCount = 1; - cmdlineP->xformList[0] = LEFTRIGHT; + xformCount = 1; + xformList[0] = LEFTRIGHT; } else if (tb) { - cmdlineP->xformCount = 1; - cmdlineP->xformList[0] = TOPBOTTOM; + xformCount = 1; + xformList[0] = TOPBOTTOM; } else if (xy) { - cmdlineP->xformCount = 1; - cmdlineP->xformList[0] = TRANSPOSE; + xformCount = 1; + xformList[0] = TRANSPOSE; } else if (r90) { - cmdlineP->xformCount = 2; - cmdlineP->xformList[0] = TRANSPOSE; - cmdlineP->xformList[1] = TOPBOTTOM; + xformCount = 2; + xformList[0] = TRANSPOSE; + xformList[1] = TOPBOTTOM; } else if (r180) { - cmdlineP->xformCount = 2; - cmdlineP->xformList[0] = LEFTRIGHT; - cmdlineP->xformList[1] = TOPBOTTOM; + xformCount = 2; + xformList[0] = LEFTRIGHT; + xformList[1] = TOPBOTTOM; } else if (r270) { - cmdlineP->xformCount = 2; - cmdlineP->xformList[0] = TRANSPOSE; - cmdlineP->xformList[1] = LEFTRIGHT; + xformCount = 2; + xformList[0] = TRANSPOSE; + xformList[1] = LEFTRIGHT; } else if (null) { - cmdlineP->xformCount = 0; + xformCount = 0; } } else if (xformSpec) - parseXformOpt(xformOpt, &cmdlineP->xformCount, cmdlineP->xformList); + parseXformOpt(xformOpt, &xformCount, xformList); else pm_error("You must specify an option such as -topbottom to indicate " "what kind of flip you want."); - if (memsizeSpec) { - if (memsizeOpt > UINT_MAX / 1024 / 1024) - pm_error("-memsize value too large: %u MiB. Maximum this program " - "can handle is %u MiB", - memsizeOpt, UINT_MAX / 1024 / 1024); - cmdlineP->availableMemory = memsizeOpt * 1024 *1024; - } else - cmdlineP->availableMemory = UINT_MAX; + computeXformCore(xformCount, xformList, &cmdlineP->xform); + + interpretMemorySize(memsizeSpec, memsizeOpt, &cmdlineP->availableMemory); if (!pagesizeSpec) cmdlineP->pageSize = 4*1024; @@ -273,109 +390,6 @@ parseCommandLine(int argc, char ** const argv, -/* See transformPoint() for an explanation of the transform matrix types. - The difference between the two types is that 'xformMatrix' is particular - to the source image dimensions and can be used to do the transformation, - while 'normalizedXformMatrix' is independent of the source image. -*/ - -struct normalizedXformMatrix { - int a; /* -1, 0, or 1 */ - int b; /* -1, 0, or 1 */ - int c; /* -1, 0, or 1 */ - int d; /* -1, 0, or 1 */ -}; - - - -struct xformMatrix { - int a; /* -1, 0, or 1 */ - int b; /* -1, 0, or 1 */ - int c; /* -1, 0, or 1 */ - int d; /* -1, 0, or 1 */ - int e; /* 0 or maximum column number in target image */ - int f; /* 0 or maximum row number in target image */ -}; - - - -static void -leftright(struct normalizedXformMatrix * const xformP) { - xformP->a = - xformP->a; - xformP->c = - xformP->c; -} - - - -static void -topbottom(struct normalizedXformMatrix * const xformP) { - xformP->b = - xformP->b; - xformP->d = - xformP->d; -} - - - -static void -swap(int * const xP, int * const yP) { - - int const t = *xP; - - *xP = *yP; - *yP = t; -} - - - -static void -transpose(struct normalizedXformMatrix * const xformP) { - swap(&xformP->a, &xformP->b); - swap(&xformP->c, &xformP->d); -} - - - -static void -computeXformMatrix(struct xformMatrix * const xformP, - unsigned int const sourceCols, - unsigned int const sourceRows, - unsigned int const xformCount, - enum xformType const xformType[]) { - - struct normalizedXformMatrix const nullTransform = {1, 0, 0, 1}; - - struct normalizedXformMatrix nXform; - unsigned int i; - - nXform = nullTransform; /* initial value */ - - for (i = 0; i < xformCount; ++i) { - switch (xformType[i]) { - case LEFTRIGHT: - leftright(&nXform); - break; - case TOPBOTTOM: - topbottom(&nXform); - break; - case TRANSPOSE: - transpose(&nXform); - break; - } - } - { - int colMax = nXform.a * (sourceCols-1) + nXform.c * (sourceRows-1); - int rowMax = nXform.b * (sourceCols-1) + nXform.d * (sourceRows-1); - - xformP->a = nXform.a; - xformP->b = nXform.b; - xformP->c = nXform.c; - xformP->d = nXform.d; - xformP->e = colMax < 0 ? -colMax : 0; - xformP->f = rowMax < 0 ? -rowMax : 0; - } -} - - - static void bitOrderReverse(unsigned char * const bitrow, unsigned int const cols) { @@ -495,8 +509,6 @@ transformRowByRowNonPbm(struct pam * const inpamP, scratchTuplerow = NULL; newtuplerow = tuplerow; } - pnm_writepaminit(outpamP); - for (row = 0; row < inpamP->height ; ++row) { pnm_readpamrow(inpamP, tuplerow); pnm_writepamrow(outpamP, newtuplerow); @@ -510,32 +522,11 @@ transformRowByRowNonPbm(struct pam * const inpamP, static void -transformRowByRow(struct pam * const inpamP, - struct pam * const outpamP, - bool const reverse, - bool const verbose) { - - if (verbose) - pm_message("Transforming row by row, top to bottom"); - - switch (PNM_FORMAT_TYPE(inpamP->format)) { - case PBM_TYPE: - transformRowByRowPbm(inpamP, outpamP, reverse); - break; - default: - transformRowByRowNonPbm(inpamP, outpamP, reverse); - break; - } -} - - - -static void transformRowsBottomTopPbm(struct pam * const inpamP, struct pam * const outpamP, bool const reverse) { /*---------------------------------------------------------------------------- - Flip a PBM image top for bottom. Iff 'reverse', also flip it left for right. + Flip a PBM image top for bottom. Read complete image into memory in packed PBM format; Use fast packed PBM bit reverse algorithm (where required). @@ -567,80 +558,42 @@ transformRowsBottomTopPbm(struct pam * const inpamP, static void transformRowsBottomTopNonPbm(struct pam * const inpamP, - struct pam * const outpamP, - bool const reverse) { + struct pam * const outpamP) { /*---------------------------------------------------------------------------- - Read complete image into memory as a tuple array. + Do a simple vertical flip. + + We do this faster than the more general subroutines because we just + move the row pointers. + + But we must fit the entire image into memory at once. - This can do any transformation except a column-for-row transformation, - on any type of image, but is slower and uses more memory than the - PBM-only transformRowsBottomTopPbm(). + This works on PBM, but the PBM-specific subroutine is faster. -----------------------------------------------------------------------------*/ tuple ** tuplerows; - tuple * scratchTuplerow; - /* This is not a full tuple row -- just an array of pointers to - the tuples in 'tuplerows'. - */ unsigned int row; - if (reverse) - MALLOCARRAY_NOFAIL(scratchTuplerow, inpamP->width); - else - scratchTuplerow = NULL; - tuplerows = pnm_allocpamarray(outpamP); for (row = 0; row < inpamP->height ; ++row) pnm_readpamrow(inpamP, tuplerows[row]); - pnm_writepaminit(outpamP); - - for (row = 0; row < inpamP->height ; ++row) { - tuple * newtuplerow; + for (row = 0; row < inpamP->height; ++row) { tuple * const tuplerow = tuplerows[inpamP->height - row - 1]; - if (reverse) { - unsigned int col; - newtuplerow = scratchTuplerow; - for (col = 0; col < inpamP->width; ++col) - newtuplerow[col] = tuplerow[inpamP->width - col - 1]; - } else - newtuplerow = tuplerow; - pnm_writepamrow(outpamP, newtuplerow); - } - if (scratchTuplerow) - free(scratchTuplerow); + pnm_writepamrow(outpamP, tuplerow); + } pnm_freepamarray(tuplerows, outpamP); } -static void -transformRowsBottomTop(struct pam * const inpamP, - struct pam * const outpamP, - bool const reverse, - bool const verbose) { - - if (PNM_FORMAT_TYPE(inpamP->format) == PBM_TYPE) { - if (verbose) - pm_message("Transforming PBM row by row, bottom to top"); - transformRowsBottomTopPbm(inpamP, outpamP, reverse); - } else { - if (verbose) - pm_message("Transforming non-PBM row by row, bottom to top"); - transformRowsBottomTopNonPbm(inpamP, outpamP, reverse); - } -} - - - static void __inline__ transformPoint(int const col, int const row, struct xformMatrix const xform, - int * const newcolP, - int * const newrowP ) { + unsigned int * const newcolP, + unsigned int * const newrowP ) { /*---------------------------------------------------------------------------- Compute the location in the output of a pixel that is at row 'row', column 'col' in the input. Assume the output image is 'newcols' by @@ -671,9 +624,21 @@ transformPoint(int const col, static void -transformPbm(struct pam * const inpamP, - struct pam * const outpamP, - struct xformMatrix const xform) { +writeRaster(struct pam * const pamP, + tuple * const * const tuples) { + + unsigned int outRow; + + for (outRow = 0; outRow < pamP->height; ++ outRow) + pnm_writepamrow(pamP, tuples[outRow]); +} + + + +static void +transformPbmGen(struct pam * const inpamP, + struct pam * const outpamP, + struct xformCore const xformCore) { /*---------------------------------------------------------------------------- This is the same as transformGen, except that it uses less memory, since the PBM buffer format uses one bit per pixel instead @@ -683,26 +648,29 @@ transformPbm(struct pam * const inpamP, memory than the more restricted transformRowByRowPbm() and transformRowsBottomTopPbm(). -----------------------------------------------------------------------------*/ - bit* bitrow; - bit** newbits; - int row; + bit * bitrow; + bit ** newbits; + struct xformMatrix xform; + unsigned int row; + computeXformMatrix(&xform, inpamP->width, inpamP->height, xformCore); + bitrow = pbm_allocrow(inpamP->width); newbits = pbm_allocarray(pbm_packed_bytes(outpamP->width), outpamP->height); /* Initialize entire array to zeroes. One bits will be or'ed in later */ for (row = 0; row < outpamP->height; ++row) { - int col; + unsigned int col; for (col = 0; col < pbm_packed_bytes(outpamP->width); ++col) newbits[row][col] = 0; } for (row = 0; row < inpamP->height; ++row) { - int col; + unsigned int col; pbm_readpbmrow(inpamP->file, bitrow, inpamP->width, inpamP->format); for (col = 0; col < inpamP->width; ++col) { - int newcol, newrow; + unsigned int newcol, newrow; transformPoint(col, row, xform, &newcol, &newrow); newbits[newrow][newcol/8] |= bitrow[col] << (7 - newcol % 8); /* Use of "|=" patterned after pbm_readpbmrow_packed. */ @@ -711,8 +679,7 @@ transformPbm(struct pam * const inpamP, pbm_writepbminit(outpamP->file, outpamP->width, outpamP->height, 0); for (row = 0; row < outpamP->height; ++row) - pbm_writepbmrow_packed(outpamP->file, newbits[row], outpamP->width, - 0); + pbm_writepbmrow_packed(outpamP->file, newbits[row], outpamP->width, 0); pbm_freearray(newbits, outpamP->height); pbm_freerow(bitrow); @@ -720,66 +687,11 @@ transformPbm(struct pam * const inpamP, -static unsigned int -optimalSegmentSize(struct xformMatrix const xform, - struct pam * const pamP, - unsigned int const availableMemory, - unsigned int const pageSize) { -/*---------------------------------------------------------------------------- - Compute the maximum number of columns that can be transformed, one row - at a time, without causing page thrashing. - - See comments at the top of this file for an explanation of the kind - of page thrashing using segments avoids. - - 'availableMemory' is the amount of real memory in bytes that this - process should expect to be able to use. - - 'pageSize' is the size of a page in bytes. A page means the unit that - is paged in or out. - - 'pamP' describes the storage required to represent a row of the - output array. ------------------------------------------------------------------------------*/ - unsigned int segmentSize; - - if (xform.b == 0) - segmentSize = pamP->width; - else { - unsigned int const otherNeeds = 200*1024; - /* A wild guess at how much real memory is needed by the program - for purposes other than the output tuple array. - */ - if (otherNeeds > availableMemory) - segmentSize = pamP->width; /* Can't prevent thrashing */ - else { - unsigned int const availablePages = - (availableMemory - otherNeeds) / pageSize; - if (availablePages <= 1) - segmentSize = pamP->width; /* Can't prevent thrashing */ - else { - unsigned int const bytesPerRow = - pamP->width * pamP->depth * pamP->bytes_per_sample; - unsigned int rowsPerPage = - MAX(1, (pageSize + (pageSize/2)) / bytesPerRow); - /* This is how many consecutive rows we can touch - on average while staying within the same page. - */ - segmentSize = availablePages * rowsPerPage; - } - } - } - return segmentSize; -} - - - static void -transformNonPbm(struct pam * const inpamP, - struct pam * const outpamP, - struct xformMatrix const xform, - unsigned int const segmentSize, - bool const verbose) { +transformNonPbmWhole(struct pam * const inpamP, + struct pam * const outpamP, + struct xformCore const xformCore, + bool const verbose) { /*---------------------------------------------------------------------------- Do the transform using "pam" library functions, as opposed to "pbm" ones. @@ -787,62 +699,39 @@ transformNonPbm(struct pam * const inpamP, Assume input file is positioned to the raster (just after the header). - 'segmentSize' is the number of columns we are to process in each - pass. We do each segment going from left to right. For each - segment, we do each row, going from top to bottom. For each row of - the segment, we do each column, going from left to right. (The - reason Caller wants it done by segments is to improve virtual memory - reference locality. See comments at the top of this file). - - if 'segmentSize' is less than the whole image, ifP must be a seekable - file. - This can do any transformation, but is slower and uses more memory - than the PBM-only transformPbm(). + than some of the alternatives which are usable only for certain + cases. But we do require a certain amount of virtual and real memory; + transformNonPbmChunk() is even more general in that it doesn't. -----------------------------------------------------------------------------*/ - pm_filepos imagepos; - /* The input file position of the raster. But defined only if - segment size is less than whole image. - */ - tuple* tuplerow; - tuple** newtuples; - unsigned int startCol; + tuple * tuplerow; + tuple ** newtuples; + struct xformMatrix xform; + unsigned int row; + computeXformMatrix(&xform, inpamP->width, inpamP->height, xformCore); + tuplerow = pnm_allocpamrow(inpamP); newtuples = pnm_allocpamarray(outpamP); - if (segmentSize < inpamP->width) - pm_tell2(inpamP->file, &imagepos, sizeof(imagepos)); + for (row = 0; row < inpamP->height; ++row) { + unsigned int col; + pnm_readpamrow(inpamP, tuplerow); + + for (col = 0; col < inpamP->width; ++col) { + unsigned int newcol, newrow; - for (startCol = 0; startCol < inpamP->width; startCol += segmentSize) { - /* Do one set of columns which is small enough not to cause - page thrashing. - */ - unsigned int const endCol = MIN(inpamP->width, startCol + segmentSize); - unsigned int row; + transformPoint(col, row, xform, &newcol, &newrow); - if (verbose) - pm_message("Transforming Columns %u up to %u", - startCol, endCol); - - if (startCol > 0) - /* Go back to read from Row 0 again */ - pm_seek2(inpamP->file, &imagepos, sizeof(imagepos)); - - for (row = 0; row < inpamP->height; ++row) { - unsigned int col; - pnm_readpamrow(inpamP, tuplerow); - - for (col = startCol; col < endCol; ++col) { - int newcol, newrow; - transformPoint(col, row, xform, &newcol, &newrow); - pnm_assigntuple(inpamP, newtuples[newrow][newcol], - tuplerow[col]); - } + assert(newcol < outpamP->width); + assert(newrow < outpamP->height); + + pnm_assigntuple(inpamP, newtuples[newrow][newcol], + tuplerow[col]); } } - pnm_writepam(outpamP, newtuples); + writeRaster(outpamP, newtuples); pnm_freepamarray(newtuples, outpamP); pnm_freepamrow(tuplerow); @@ -850,40 +739,396 @@ transformNonPbm(struct pam * const inpamP, +typedef struct { +/*---------------------------------------------------------------------------- + A description of the quilt of cells that make up the output image. +-----------------------------------------------------------------------------*/ + unsigned int ranks, files; + /* Dimensions of the output image in cells */ + struct pam ** pam; + /* pam[y][x] is the pam structure for the cell at rank y, file x + in the output. + */ + + /* Each of the cells corresponds to a temporary file; the 'file' + member of its pam structure identifies it. But it is not a normal + Netpbm file; it contains only the raster portion. The program + writes the raster to the file, starting at offset zero, then rewinds + and reads it out later. The header is unnecessary because the pam + structure is still available at readback time. + */ +} outputMap; + + + +static void +initOutCell(struct pam * const outCellPamP, + unsigned int const inCellWidth, + unsigned int const inCellHeight, + struct pam * const inpamP, + struct xformCore const xformCore) { + + unsigned int outCellFiles, outCellRanks; + + *outCellPamP = *inpamP; /* initial value */ + + outCellPamP->len = PAM_STRUCT_SIZE(tuple_type); + + outCellPamP->file = pm_tmpfile(); + + xformDimensions(xformCore, inCellWidth, inCellHeight, + &outCellFiles, &outCellRanks); + + outCellPamP->width = outCellFiles; + outCellPamP->height = outCellRanks; +} + + + +static outputMap * +createOutputMap(struct pam * const inpamP, + unsigned int const maxRows, + struct xformMatrix const cellXform, + struct xformCore const xformCore) { + + unsigned int const inCellFiles = 1; + unsigned int const inCellRanks = (inpamP->height + maxRows - 1) / maxRows; + + outputMap * mapP; + unsigned int outCellRank; + unsigned int inCellRank; + + MALLOCVAR_NOFAIL(mapP); + + xformDimensions(xformCore, inCellFiles, inCellRanks, + &mapP->files, &mapP->ranks); + + MALLOCARRAY(mapP->pam, mapP->ranks); + if (mapP->pam == NULL) + pm_error("Could not allocate a cell array for %u ranks of cells.", + mapP->ranks); + + for (outCellRank = 0; outCellRank < mapP->ranks; ++outCellRank) { + + MALLOCARRAY(mapP->pam[outCellRank], mapP->files); + + if (mapP->pam[outCellRank] == NULL) + pm_error("Failed to allocate rank %u of the cell array, " + "%u cells wide", outCellRank, mapP->files); + } + + for (inCellRank = 0; inCellRank < inCellRanks; ++inCellRank) { + unsigned int const inCellFile = 0; + unsigned int const inCellStartRow = inCellRank * maxRows; + unsigned int const inCellRows = + MIN(inpamP->height - inCellStartRow, maxRows); + + unsigned int outCellFile, outCellRank; + transformPoint(inCellFile, inCellRank, cellXform, + &outCellFile, &outCellRank); + + initOutCell(&mapP->pam[outCellRank][outCellFile], + inpamP->width, inCellRows, + inpamP, xformCore); + } + return mapP; +} + + + +static void +destroyOutputMap(outputMap * const mapP) { + + unsigned int outCellRank; + + for (outCellRank = 0; outCellRank < mapP->ranks; ++outCellRank) + free(mapP->pam[outCellRank]); + + free(mapP->pam); + + free(mapP); +} + + + static void -transformGen(struct pam * const inpamP, - struct pam * const outpamP, - struct xformMatrix const xform, - unsigned int const availableMemory, - unsigned int const pageSize, - bool const verbose) { +rewindCellFiles(outputMap * const outputMapP) { + + unsigned int outCellRank; + + for (outCellRank = 0; outCellRank < outputMapP->ranks; ++outCellRank) { + unsigned int outCellFile; + for (outCellFile = 0; outCellFile < outputMapP->files; ++outCellFile) + pm_seek(outputMapP->pam[outCellRank][outCellFile].file, 0); + } +} + + + +static void +closeCellFiles(outputMap * const outputMapP) { + + unsigned int outCellRank; + + for (outCellRank = 0; outCellRank < outputMapP->ranks; ++outCellRank) { + unsigned int outCellFile; + for (outCellFile = 0; outCellFile < outputMapP->files; ++outCellFile) + pm_close(outputMapP->pam[outCellRank][outCellFile].file); + } +} + + + +static void +transformCell(struct pam * const inpamP, + struct pam * const outpamP, + struct xformCore const xformCore) { + + struct xformMatrix xform; + tuple * inTupleRow; + tuple ** outTuples; + unsigned int inRow; + + computeXformMatrix(&xform, inpamP->width, inpamP->height, xformCore); + + inTupleRow = pnm_allocpamrow(inpamP); + + outTuples = pnm_allocpamarray(outpamP); + + for (inRow = 0; inRow < inpamP->height; ++inRow) { + unsigned int inCol; + + pnm_readpamrow(inpamP, inTupleRow); + + for (inCol = 0; inCol < inpamP->width; ++inCol) { + unsigned int outCol, outRow; + + transformPoint(inCol, inRow, xform, &outCol, &outRow); + + assert(outCol < outpamP->width); + assert(outRow < outpamP->height); + + pnm_assigntuple(inpamP, + outTuples[outRow][outCol], inTupleRow[inCol]); + } + } + pnm_freepamrow(inTupleRow); + + writeRaster(outpamP, outTuples); + + pnm_freepamarray(outTuples, outpamP); +} + + + +static void +stitchCellsToOutput(outputMap * const outputMapP, + struct pam * const outpamP) { + + unsigned int outRank; + tuple * tupleRow; + + tupleRow = pnm_allocpamrow(outpamP); + + for (outRank = 0; outRank < outputMapP->ranks; ++outRank) { + unsigned int const cellRows = outputMapP->pam[outRank][0].height; + /* Number of rows in every cell in this rank */ + + unsigned int cellRow; + + for (cellRow = 0; cellRow < cellRows; ++cellRow) { + unsigned int outFile; + unsigned int outCol; + + outCol = 0; + + for (outFile = 0; outFile < outputMapP->files; ++outFile) { + struct pam * const outCellPamP = + &outputMapP->pam[outRank][outFile]; + + assert(outCellPamP->height == cellRows); + + assert(outCol < outpamP->width); + pnm_readpamrow(outCellPamP, &tupleRow[outCol]); + + outCol += outCellPamP->width; + } + + assert(outCol = outpamP->width); + + pnm_writepamrow(outpamP, tupleRow); + } + } + pnm_freepamrow(tupleRow); +} + + + +static void +transformNonPbmChunk(struct pam * const inpamP, + struct pam * const outpamP, + struct xformCore const xformCore, + unsigned int const maxRows, + bool const verbose) { /*---------------------------------------------------------------------------- - Produce the transformed output on Standard Output. + Same as transformNonPbmChunk(), except we read 'maxRows' rows of the + input into memory at a time, storing intermediate results in temporary + files, to limit our use of virtual and real memory. Assume input file is positioned to the raster (just after the header). - This can transform any image in any way, but is slower and uses more - memory than the more restricted transformRowByRow() and - transformRowsBottomTop(). + We call the strip of 'maxRows' rows that we read a source cell. We + transform that cell according to 'xformCore' to create to create a + target cell. We store all the target cells in temporary files. + We consider the target cells to be arranged in a column matrix the + same as the source cells within the source image; we transform that + matrix according to 'xformCore'. The resulting cell matrix is the + target image. -----------------------------------------------------------------------------*/ - unsigned int const segmentSize = - optimalSegmentSize(xform, outpamP, availableMemory, pageSize); + unsigned int const inCellFiles = 1; + unsigned int const inCellRanks = (inpamP->height + maxRows - 1) / maxRows; + + struct xformMatrix cellXform; + unsigned int inCellRank; + outputMap * outputMapP; + + if (verbose) + pm_message("Transforming in %u chunks, using temp files", inCellRanks); + + computeXformMatrix(&cellXform, inCellFiles, inCellRanks, xformCore); + + outputMapP = createOutputMap(inpamP, maxRows, cellXform, xformCore); + + for (inCellRank = 0; inCellRank < inCellRanks; ++inCellRank) { + unsigned int const inCellFile = 0; + unsigned int const inCellStartRow = inCellRank * maxRows; + unsigned int const inCellRows = + MIN(inpamP->height - inCellStartRow, maxRows); + + struct pam inCellPam; + struct pam * outCellPamP; + unsigned int outCellFile, outCellRank; + + transformPoint(inCellFile, inCellRank, cellXform, + &outCellFile, &outCellRank); - switch (PNM_FORMAT_TYPE(inpamP->format)) { - case PBM_TYPE: - transformPbm(inpamP, outpamP, xform); - break; - default: - if (segmentSize < outpamP->width) { - if (verbose && xform.b !=0) - pm_message("Transforming %u columns of %u total at a time", - segmentSize, outpamP->width); - else - pm_message("Transforming entire image at once"); - } - transformNonPbm(inpamP, outpamP, xform, segmentSize, verbose); - break; + outCellPamP = &outputMapP->pam[outCellRank][outCellFile]; + + /* Input cell is just the next 'inCellRows' rows of input image */ + inCellPam = *inpamP; + inCellPam.height = inCellRows; + + transformCell(&inCellPam, outCellPamP, xformCore); + } + + rewindCellFiles(outputMapP); + + stitchCellsToOutput(outputMapP, outpamP); + + closeCellFiles(outputMapP); + + destroyOutputMap(outputMapP); +} + + + +static unsigned int +maxRowsThatFit(struct pam * const pamP, + size_t const availableMemory) { + + unsigned long const otherNeeds = 100*1024; + /* A wild guess at how much memory (from the same pool as the + input rows) is needed for things other than the input rows + */ + unsigned long const availForRows = + availableMemory > otherNeeds ? availableMemory - otherNeeds : 0; + unsigned int const bytesPerTuple = + pamP->depth * sizeof(sample) + sizeof(tuple *); + unsigned int const bytesPerRow = + pamP->width * bytesPerTuple + sizeof(tuple **); + + unsigned long const maxRows = availForRows / bytesPerRow; + + if (maxRows < 1) + pm_error("You haven't allowed enough memory to fit even one row " + "of the source image in memory. The minimum chunk size " + "is one row; we need at least %lu bytes.", + otherNeeds + bytesPerRow); + + return (unsigned int)MIN(maxRows, UINT_MAX); +} + + + +static void +transformPbm(struct pam * const inpamP, + struct pam * const outpamP, + struct xformCore const xform, + bool const verbose) { + + if (xform.b == 0 && xform.c == 0) { + /* Rows of input map to rows of target; no column-for-row */ + if (xform.d == 1) + /* Row N of the output is based only on Row N of the + input, so we can transform row by row and avoid + in-memory buffering altogether. + */ + transformRowByRowPbm(inpamP, outpamP, xform.a == -1); + else + /* Row N of the output is based only on Row ~N of the + input. We need all the rows in memory, but have to pass + through them only twice, so there is no page thrashing concern. + */ + transformRowsBottomTopPbm(inpamP, outpamP, xform.a == -1); + } else + /* This is a column-for-row type of transformation, which requires + complex traversal of an in-memory image. + */ + transformPbmGen(inpamP, outpamP, xform); +} + + + +static void +transformNonPbm(struct pam * const inpamP, + struct pam * const outpamP, + struct xformCore const xform, + unsigned int const availableMemory, + bool const verbose) { + + if (xform.b == 0 && xform.c == 0 && xform.d == 1) { + /* Row N of the output is based only on Row N of the + input, so we can transform row by row and avoid + in-memory buffering altogether. + */ + if (verbose) + pm_message("Transforming row by row"); + transformRowByRowNonPbm(inpamP, outpamP, xform.a == -1); + } else { + unsigned int const maxRows = maxRowsThatFit(inpamP, availableMemory); + if (maxRows >= inpamP->height) { + /* We can fit the whole image in memory at once and avoid + temporary files. + */ + if (xform.b == 0 && xform.c == 0 && xform.d == -1 && + xform.a == 1) { + /* This is just a vertical flip; We can move whole rows + instead of individual pixels and save time. + */ + if (verbose) + pm_message("Transforming whole rows, all in memory"); + + transformRowsBottomTopNonPbm(inpamP, outpamP); + } else { + if (verbose) + pm_message("Transforming whole image at once, " + "pixel by pixel"); + transformNonPbmWhole(inpamP, outpamP, xform, verbose); + } + } else + /* No optimizations possible */ + transformNonPbmChunk(inpamP, outpamP, xform, maxRows, verbose); } } @@ -894,8 +1139,8 @@ main(int argc, char * argv[]) { struct cmdlineInfo cmdline; struct pam inpam; struct pam outpam; + unsigned int cols, rows; FILE * ifP; - struct xformMatrix xform; pnm_init(&argc, argv); @@ -908,36 +1153,22 @@ main(int argc, char * argv[]) { pnm_readpaminit(ifP, &inpam, PAM_STRUCT_SIZE(tuple_type)); - computeXformMatrix(&xform, inpam.width, inpam.height, - cmdline.xformCount, cmdline.xformList); - outpam = inpam; /* initial value */ outpam.file = stdout; - outpam.width = abs(xform.a * inpam.width + xform.c * inpam.height); - outpam.height = abs(xform.b * inpam.width + xform.d * inpam.height); + xformDimensions(cmdline.xform, inpam.width, inpam.height, &cols, &rows); + outpam.width = cols; outpam.height = rows; - if (xform.b == 0 && xform.d == 1 && xform.f == 0) - /* In this case Row N of the output is based only on Row N of - the input, so we can transform row by row and avoid - in-memory buffering altogether. - */ - transformRowByRow(&inpam, &outpam, xform.a == -1, cmdline.verbose); - else if (xform.b == 0 && xform.c == 0) - /* In this case, Row N of the output is based only on Row ~N of the - input. We need all the rows in memory, but have to pass - through them only twice, so there is no page thrashing concern. - */ - transformRowsBottomTop(&inpam, &outpam, xform.a == -1, - cmdline.verbose); - else - /* This is a column-for-row type of transformation, which requires - complex traversal of an in-memory image. - */ - transformGen(&inpam, &outpam, xform, - cmdline.availableMemory, cmdline.pageSize, - cmdline.verbose); + pnm_writepaminit(&outpam); + switch (PNM_FORMAT_TYPE(inpam.format)) { + case PBM_TYPE: + transformPbm(&inpam, &outpam, cmdline.xform, cmdline.verbose); + break; + default: + transformNonPbm(&inpam, &outpam, cmdline.xform, + cmdline.availableMemory, cmdline.verbose); + } pm_close(inpam.file); pm_close(outpam.file); |