Very Fast Algorithms for Eliminating the Diffraction Effects in Protein-Based Volumetric Memories

Received 12 December 2014; Accepted 17 February 2015; Publication 31 March 2015

Dragos Trinca1 and Sanguthevar Rajasekaran2



Abstract

One of the current research directions in biological nanotechnology is the use of bacteriorhodopsin in the fabrication of protein-based volumetric memories. Bacteriorhodopsin, with its unique light-activated photocycle, nanoscale size, cyclicity (>10⁷), and natural resistance to harsh environmental conditions, provides for protein-based volumetric memories that have a comparative advantage over magnetic and optical data storage devices. The construction of protein-based volumetric memories has, however, been severely limited by fundamental issues, such as unwanted diffraction effects. In this paper, we propose some optimizations that can be applied to one of the previously proposed algorithms for eliminating the diffraction effects.



Keywords


1 Introduction

Much of the current research effort in biological nanotechnology is directed toward self-assembled monolayers and thin films, biosensors, and protein-based photonic devices [1–6]. Although a number of proteins have been explored for device applications, bacteriorhodopsin [3] has received, it seems, the most attention. Bacteriorhodopsin protein memory devices exhibit increased thermal, chemical and photochromic stability, and have the advantage of being radiation-hardened, waterproof, and EMP-resistant. Such devices are capable of storing large amounts of data in a small volume.

Although there are a number of prototype systems and preliminary efforts to apply them, the potential of this promising technology is relatively unexplored. Research on protein-based memories started in the late 1980s with considerable anticipation, but enthusiasm decreased shortly afterwards for several reasons. Commercial development of spatial light modulators (SLMs), which are an integral component of protein-based memories, was slow, and several issues, such as unwanted diffraction effects, limited performance in three-dimensional memory applications. More recently, however, the development of high-definition television projection equipment has resulted in the commercial availability of high-resolution, high-performance and relatively inexpensive SLMs. Fundamental problems have remained, however. Two such problems are diffraction effects and scaling.

1.1 Diffraction Effects

The branched-photocycle three-dimensional memory, as discussed in [2, 7], stores data by using a sequential two-photon process to convert bacteriorhodopsin (bR) in the activated region from the bR resting state to the Q state. This process involves using a paging beam to select a thin page of memory and a writing beam that is pixelated at those positions where data are to be written. The transition from the bR resting state to the Q state occurs via the intermediate states K, L, M, N, and O. Only the bR (bit 0) and Q (bit 1) states are stable for long periods of time. By using 32-level grey-scaling and two polarizations, each voxel can store 64 bits. Attempts to use higher levels of grey-scaling have failed, mainly due to the problems of diffraction introduced by having pages with significant differences in the average refractive indices. To understand this problem, we note that protein corresponding to bit 0 has a high refractive index and protein corresponding to bit 1 has a low refractive index with reference to the red laser beam at 633 nm that is used to read out the data. Prototypes of the three-dimensional memory fail to work at high storage densities when individual pages of memory have a preponderance of bits of a given state. Consider the worst case, in which each page is either all 0’s or all 1’s. Then a refractive index grating is created that diffracts the laser beams quite efficiently, because individual pages are stored with separations of 6–20 µm. While these separations are significantly larger than the diffraction limit would dictate, closer spacing is not possible, due to beam steering inside the data cuvettes. The unwanted beam steering is due to refractive index gradients. Algorithms that can store data at high resolution and that maintain the number of 0’s equal to the number of 1’s would solve this problem [7–12].

2 Optimizing the APPROXv3 Algorithm

One way of ensuring an equal number of 0’s and 1’s is to replace each 0 with 01 and each 1 with 10. However, this procedure would reduce the available memory by a factor of 2 (which means that the utility factor would be 50% in this case). In [7] and [9], the authors have proposed methods that provide utility factors of more than 50%. For completely random data, the best utility factor provided by the previously proposed algorithms is about 99.9%, and can be obtained by using the APPROXv1 algorithm proposed in [9], which works as follows. Let I be the input data, and bI the binary representation of I. If I is of length L bytes, then bI is of length 8L, as each byte of data is represented in binary on 8 bits. Scan bI and count the number of 0’s and 1’s. Suppose that bI does not have an equal number of 0’s and 1’s. Scan bI from left to right, bit by bit. For each position i in the scanning, 1 ≤ i ≤ 8L, do as follows. Let bI_{1:i} be the portion already processed, bI_{i+1:8L} the portion that remains to be processed, and \overline{bI_{1:i}} the complement of bI_{1:i}. If the binary string \overline{bI_{1:i}} bI_{i+1:8L} has an equal number of 0’s and 1’s, then stop the scanning and let q be this first position i. The output of APPROXv1 is x \overline{bI_{1:q}} bI_{q+1:8L}, where x is a binary string of length 256 such that the first 120 bits of x store q, the next 3 bits store the version of the algorithm (in this case 001), the next 5 bits are 00000 (not used in APPROXv1), and the remaining 128 bits are used to make x a binary string with an equal number of 0’s and 1’s. So, in conclusion, the length of the output is 8L + 256. (There is no particular reason for choosing the length of x to be 256, other than that this value seems to be sufficient in practice.)
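To make the procedure concrete, the following Python sketch implements the APPROXv1 scan and output construction as described above. It is only an illustrative reading of the description, not the implementation from [9]; the names approx_v1 and build_header are ours, and 0-based string indexing replaces the 1-based positions used in the text.

def approx_v1(bI: str) -> str:
    """Sketch of APPROXv1 as described above: find the first position q such
    that complementing bI[0:q] yields an equal number of 0's and 1's, then
    prepend a 256-bit header x that stores q and the algorithm version."""
    n = len(bI)                       # n = 8L, the length of bI in bits
    zeros = bI.count('0')
    ones = n - zeros
    if zeros == ones:                 # already balanced, nothing to complement
        q = 0
    else:
        target = (zeros - ones) // 2  # the prefix must contain 'target' more 0's than 1's
        diff = 0                      # (#0's - #1's) in the prefix scanned so far
        for i, bit in enumerate(bI, start=1):
            diff += 1 if bit == '0' else -1
            if diff == target:        # complementing bI[0:i] balances the whole string
                q = i
                break
    header = build_header(q, version=0b001, flags=0b00000)
    complemented = ''.join('1' if b == '0' else '0' for b in bI[:q])
    return header + complemented + bI[q:]

def build_header(q: int, version: int, flags: int) -> str:
    """256-bit header: 120 bits for q, 3 bits for the version, 5 flag bits,
    and 128 padding bits chosen so that the header itself is balanced."""
    body = format(q, '0120b') + format(version, '03b') + format(flags, '05b')
    pad_zeros = 128 - body.count('0')        # zeros still needed for a balanced header
    return body + '1' * (128 - pad_zeros) + '0' * pad_zeros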

In [11], some optimizations that can be applied to APPROXv1 have been proposed. Precisely, the optimizations proposed in [11] apply to the second scanning step (the one in which the position q is found), and work as follows. Suppose that bI has more 0’s than 1’s. Let NofZs be the number of 0’s in bI, and NofOs the number of 1’s in bI. Observe that every position i that leads to a \overline{bI_{1:i}} bI_{i+1:8L} with an equal number of 0’s and 1’s has the property that bI_{1:i} has exactly (i − (NofZs − NofOs)/2)/2 + (NofZs − NofOs)/2 0’s and (i − (NofZs − NofOs)/2)/2 1’s. So, during the scanning, we can skip those positions that certainly do not have this property, and examine only those positions that may have it, until the desired position is found. The scanning step is modified accordingly.

This is the APPROXv2 algorithm recently proposed in [11]. The output in APPROXv2 is x \overline{bI_{1:q}} bI_{q+1:8L}, where the first 120 bits of x store q, the next 3 bits are 010 (the version of the algorithm, which is 2), the next 5 bits are 00000 (also not used in APPROXv2), and the remaining 128 bits are used to make x a binary string with an equal number of 0’s and 1’s, as usual.
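The skipping idea can be illustrated with the following Python sketch. It is one plausible realization of the optimization described above (jumping ahead by the number of positions that certainly cannot satisfy the count condition), not necessarily the exact rule from [11]; the output would then be assembled as in the APPROXv1 sketch, with version bits 010.

def approx_v2_scan(bI: str) -> int:
    """Return the first position q such that complementing bI[0:q] balances
    the number of 0's and 1's, skipping positions that cannot qualify."""
    n = len(bI)
    zeros = bI.count('0')
    target = (zeros - (n - zeros)) // 2   # required value of (#0's - #1's) in bI[0:q]
    i, diff = 0, 0                        # diff = (#0's - #1's) in bI[0:i]
    while True:
        if diff == target:
            return i
        # diff changes by +/-1 per bit, so the condition cannot hold again
        # before position i + |target - diff|; jump there in one step.
        jump = abs(target - diff)
        block = bI[i:i + jump]
        diff += 2 * block.count('0') - len(block)
        i += jump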

Example 1 Suppose that

bI=01011010110000011010000100001111.

bI has 18 0’s and 14 1’s. APPROXv1 works as follows: scanning from left to right, the first position i for which \overline{bI_{1:i}} bI_{i+1:32} has an equal number of 0’s and 1’s is i = 14, so q = 14 and the output is x \overline{bI_{1:14}} bI_{15:32}.

Example 2 For the same bI, APPROXv2 works as follows: it examines only the candidate positions allowed by the count condition and finds the same position q = 14, producing the same output (apart from the version bits in x).

APPROXv2 can be optimized as follows. Assume that, besides the input data, we have a parameter sl, such that 1 ≤ sl ≤ 8L, and such that 8L is a multiple of sl. Then, we consider the following substrings of bI:

bI_{1:sl}, bI_{sl+1:2sl}, \ldots, bI_{8L−sl+1:8L}.

So, we are splitting bI into 8L/sl substrings, each of length sl. For each bI_{i·sl−sl+1:i·sl}, 1 ≤ i ≤ 8L/sl, consider the pairs (a_i, b_i) and (c_i, d_i), where

a_i = Zs[i·sl − sl + 1] − (i·sl − sl + 1 − Zs[i·sl − sl + 1]),
b_i = Zs[i·sl] − (i·sl − Zs[i·sl]),
c_i = (i·sl − sl + 1 − Zs[i·sl − sl + 1]) − Zs[i·sl − sl + 1],
d_i = (i·sl − Zs[i·sl]) − Zs[i·sl],

where Zs[p] denotes the number of 0’s in bI_{1:p}.

So, for each bI_{i·sl−sl+1:i·sl}, a_i and b_i are the differences between the number of 0’s and 1’s at the two ends, while c_i and d_i are the differences between the number of 1’s and 0’s at the two ends. APPROXv1 and APPROXv2 start with a scanning of bI during which the number of 0’s and 1’s is counted; the pairs (a_i, b_i) and (c_i, d_i) can be computed during this first scanning. We have the following fact [12].

Fact 1 If NofZs > NofOs, then there exists at least one i, 1 ≤ i ≤ 8L/sl, such that

a_i ≤ (NofZs − NofOs)/2 ≤ b_i.

Otherwise, if NofOs > NofZs, then there exists at least one i, 1 ≤ i ≤ 8L/sl, such that

c_i ≤ (NofOs − NofZs)/2 ≤ d_i.

Given this, APPROXv2 can be optimized as follows. Suppose that NofZs > NofOs. Let j be the first index that satisfies a_j ≤ (NofZs − NofOs)/2 ≤ b_j. Then, we can search for the position q only in the substring bI_{j·sl−sl+1:j·sl}, because, at this point, we know that the position q we are looking for satisfies j·sl − sl + 1 ≤ q ≤ j·sl. This is the APPROXv3 algorithm, proposed in [12]. The output in APPROXv3 is as in APPROXv2, the only difference being x_{121:123} = 011 (the version of the algorithm, which is 3).
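A Python sketch of the APPROXv3 search, written directly from the description above (our code, not the implementation from [12]), could look as follows; it records the prefix differences at the block boundaries during the first scan and then restricts the second scan to the first block satisfying Fact 1.

def approx_v3_find_q(bI: str, sl: int) -> int:
    """Return a position q such that complementing bI[0:q] balances the
    number of 0's and 1's.  Assumes NofZs > NofOs and len(bI) % sl == 0."""
    n = len(bI)
    target = (2 * bI.count('0') - n) // 2      # (NofZs - NofOs) / 2 > 0

    # First scan: diff = (#0's - #1's) of the prefix ending at each block
    # boundary; a_j and b_j are the diffs at the two ends of block j.
    ends = []
    diff = 0
    for start in range(0, n, sl):
        diff += 2 * bI[start:start + sl].count('0') - sl
        ends.append(diff)

    # Find the first block j with a_j <= target <= b_j (Fact 1 guarantees one).
    for j in range(len(ends)):
        b_j = ends[j]
        prev = ends[j - 1] if j > 0 else 0
        a_j = prev + (1 if bI[j * sl] == '0' else -1)   # diff at the block's left end
        if a_j <= target <= b_j:
            break

    # Second scan, restricted to block j: find the exact position q inside it.
    diff = prev
    for pos in range(j * sl, (j + 1) * sl):
        diff += 1 if bI[pos] == '0' else -1
        if diff == target:
            return pos + 1                      # 1-based position q
    raise AssertionError("unreachable if the preconditions hold")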

Example 3 For the same bI, and assuming sl = 8, APPROXv3 works as follows: the first block satisfying the condition of Fact 1 is j = 2 (a_2 = −1, b_2 = 2), so the second scan is restricted to bI_{9:16} and again finds q = 14.

Some results when comparing APPROXv1 with APPROXv2 are given in Table 1. APPROXv3 takes less than one millisecond, in general.

Table 1 Experimental results


In this paper, we propose some optimizations that can be applied to APPROXv3, and that could lead to faster practical algorithms for eliminating the diffraction effects. Suppose that NofZs > NofOs, and let j be the first index that satisfies

a_j ≤ (NofZs − NofOs)/2 ≤ b_j.

Then, we search for the position q in the substring bI_{j·sl−sl+1:j·sl}, and we find it.

If q ≤ 8L/2, then the output is x \overline{bI_{1:q}} bI_{q+1:8L}, where x is a binary string of length 256 such that the first 120 bits of x store q, the next 3 bits are 100, the next 5 bits are 00000, and the remaining 128 bits are used to make x a binary string with an equal number of 0’s and 1’s. If q ≥ 8L/2 + 1, then the output is x bI_{1:q} \overline{bI_{q+1:8L}}, where x is a binary string of length 256 such that the first 120 bits of x store q, the next 3 bits are 100, the next 5 bits are 00000, and the remaining 128 bits are used to make x a binary string with an equal number of 0’s and 1’s.

Denote this version of APPROXv3 by APPROXv4. Clearly, in the case q ≤ 8L/2, APPROXv4 is basically APPROXv3. But, when q ≥ 8L/2 + 1, we compute in the output of APPROXv4 the complement of bI_{q+1:8L} instead of the complement of bI_{1:q}, the reason being that in this case the length of bI_{q+1:8L} is strictly smaller than the length of bI_{1:q}. (Note that if complementing bI_{1:q} balances the string, then so does complementing bI_{q+1:8L}, since both parts have the same excess of 0’s over 1’s, namely (NofZs − NofOs)/2.) In practice, in the case q ≥ 8L/2 + 1, this means that the larger the difference q − 8L/2 is, the faster APPROXv4 is as compared to APPROXv3.
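The output step of APPROXv4 can be sketched in Python as follows; the function name approx_v4_output and the explicit header layout are our reading of the description above, under the assumption that the position q has already been found by the APPROXv3-style search.

def approx_v4_output(bI: str, q: int) -> str:
    """Sketch of the APPROXv4 output step: given a position q at which either
    side can be complemented to balance the string, complement the shorter side."""
    n = len(bI)
    flip = lambda s: ''.join('1' if b == '0' else '0' for b in s)
    if q <= n // 2:
        body = flip(bI[:q]) + bI[q:]        # complement the prefix, as in APPROXv3
    else:
        body = bI[:q] + flip(bI[q:])        # complement the (shorter) suffix instead
    # 256-bit header: 120 bits for q, version bits 100, five unused flag bits,
    # then 128 padding bits chosen so that the header itself is balanced.
    head = format(q, '0120b') + '100' + '00000'
    pad_zeros = 128 - head.count('0')
    header = head + '1' * (128 - pad_zeros) + '0' * pad_zeros
    return header + body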

Example 4 Suppose that

bI=01011010011010010110000101100010.

bI has 18 0’s and 14 1’s. Assuming sl = 8, APPROXv4 works as follows: the block-based search yields q = 22; since 22 > 16 = 8L/2, the output is x bI_{1:22} \overline{bI_{23:32}}, that is, the shorter part bI_{23:32} is the one complemented.

APPROXv4 can be further optimized as follows. Suppose that NofZs > NofOs, and let j be the first index that satisfies a_j ≤ (NofZs − NofOs)/2 ≤ b_j.

Then, we search for the position q in the substring bI_{j·sl−sl+1:j·sl}, and find it.

If q ≤ 8L/2, then the output is x \overline{bI_{1:q}} bI_{q+1:8L}, where x is a binary string of length 256 such that the first 120 bits of x store q, the next 3 bits are 101, the next 5 bits are 00000, and the remaining 128 bits are used to make x a binary string with an equal number of 0’s and 1’s. If q ≥ 8L/2 + 1, then consider the input rI, where rI_i = bI_{8L−i+1}; that is, rI is the reverse of bI. For rI, we know that NofZs > NofOs. From the pairs (a_i, b_i) corresponding to bI, we consider the corresponding pairs for rI, say (a_i′, b_i′). Let j′ be the first index that satisfies

a_{j′}′ ≤ (NofZs − NofOs)/2 ≤ b_{j′}′.

Then, we search for the position q′ in the substring rI_{j′·sl−sl+1:j′·sl}, and find it. That is, q′ is the first position i in rI such that \overline{rI_{1:i}} rI_{i+1:8L} has an equal number of 0’s and 1’s. Then, the output is x \overline{rI_{1:q′}} rI_{q′+1:8L}, where x is a binary string of length 256 such that the first 120 bits of x store q′, the next 3 bits are 101, the next 5 bits are 10000, and the remaining 128 bits are used to make x a binary string with an equal number of 0’s and 1’s. So, in the case q ≥ 8L/2 + 1, the first 120 bits of x store the position q′ instead of q, and since q′ < 8L/2, the 124-th bit of x, which is 1, tells us at decompression that at compression the input rI was considered instead of bI.

Denote this version of APPROXv4 by APPROXv5. In the case q ≤ 8L/2, APPROXv5 is basically APPROXv4. When q ≥ 8L/2 + 1, in APPROXv5 we consider the input rI instead of bI. In practice, in the case q ≥ 8L/2 + 1, if q′ is significantly smaller than 8L − q (the length of \overline{bI_{q+1:8L}}), then complementing rI_{1:q′} instead of bI_{q+1:8L} could lead to a significant speed-up.

We note that, in the case q ≥ 8L/2 + 1, APPROXv5 does additional work as compared to APPROXv4 by finding the position q′. However, this additional work is negligible in general. In practice, if the decompression time is more important than the compression time, that is, if the time taken to decompress the output x \overline{rI_{1:q′}} rI_{q′+1:8L} matters more than the time taken to construct it, then, whenever q′ is significantly smaller than 8L − q, restoring rI_{1:q′} from \overline{rI_{1:q′}} takes significantly less time at decompression than restoring bI_{q+1:8L} from \overline{bI_{q+1:8L}} would.
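Putting the pieces together, a minimal Python sketch of the APPROXv5 idea might look as follows; for brevity it finds the balancing positions with a plain scan instead of the block-based search of APPROXv3, but the reversal trick and the header flag are as described above, and the function names are ours.

def approx_v5_output(bI: str) -> str:
    """Sketch of APPROXv5: if the balancing position q falls in the second
    half of bI, work on the reversed string rI instead and set a flag bit so
    the decoder knows the input was reversed."""
    n = len(bI)
    target = (2 * bI.count('0') - n) // 2          # assumes NofZs > NofOs
    flip = lambda s: ''.join('1' if b == '0' else '0' for b in s)

    def first_balancing_position(s: str) -> int:
        """First q with (#0's - #1's) of s[0:q] equal to target."""
        diff = 0
        for i, bit in enumerate(s, start=1):
            diff += 1 if bit == '0' else -1
            if diff == target:
                return i
        raise AssertionError("unreachable when NofZs > NofOs")

    q = first_balancing_position(bI)
    if q <= n // 2:
        data, pos, flags = flip(bI[:q]) + bI[q:], q, '00000'
    else:
        rI = bI[::-1]                              # reversal keeps the 0/1 counts
        q2 = first_balancing_position(rI)
        data, pos, flags = flip(rI[:q2]) + rI[q2:], q2, '10000'   # bit 124 = 1 marks reversal

    # 256-bit balanced header: 120 bits for the stored position, version 101,
    # 5 flag bits, and 128 padding bits that balance the header.
    head = format(pos, '0120b') + '101' + flags
    pad_zeros = 128 - head.count('0')
    return head + '1' * (128 - pad_zeros) + '0' * pad_zeros + data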

Example 5 Suppose that

bI=01011010011010010110000101100010,

as before. bI has 18 0’s and 14 1’s. Assuming sl = 8, APPROXv5 works as follows: as in Example 4, q = 22 > 16 = 8L/2, so the reverse rI of bI is considered; the search on rI yields q′ = 4, and the output is x \overline{rI_{1:4}} rI_{5:32}, with the 124-th bit of x equal to 1.

Similarly to APPROXv3, for inputs bI a few million bits long, both APPROXv4 and APPROXv5 take around one second in our implementations.

3 Discussion and Conclusion

The algorithms presented are indeed very fast in practice, which should allow protein-based memories to operate quickly and in real time. This conclusion follows from the fact that, for example, APPROXv5 takes around one second for an input of a few million bits, so for a protein-based memory page of one thousand bits, APPROXv5 would normally take around one millisecond, which is quite fast. We anticipate that the processor latency caused by executing one of these algorithms for compression (when writing a page to the protein-based memory) or decompression (when reading a page from the protein-based memory) would be negligible, especially with current technology, which means that in practice protein-based memories should be quite fast.

References

[1] R. R. Birge, Photophysics and molecular electronic applications of the rhodopsins, Annual Review of Physical Chemistry, vol. 41, pp. 683–733 (1990)

[2] R. R. Birge, N. B. Gillespie, E. W. Izaguirre, A. Kusnetzow, A. F. Lawrence, D. Singh, Q. W. Song, E. Schmidt, J. A. Stuart, S. Seetharaman and K. J. Wise, Biomolecular electronics: protein-based associative processors and volumetric memories, Journal of Physical Chemistry B, vol. 103, pp. 10746–10766 (1999)

[3] N. A. Hampp, Bacteriorhodopsin: mutating a biomaterial into an optoelectronic material, Applied Microbiology and Biotechnology, vol. 53, pp. 633–639 (2000)

[4] J. R. Hillebrecht, J. F. Koscielecki, K. J. Wise, D. L. Marcy, W. Tetley, R. Rangarajan, J. Sullivan, M. Brideau, M. P. Krebs, J. A. Stuart and R. R. Birge, Optimization of protein-based volumetric optical memories and associative processors by using directed evolution, NanoBiotechnology, vol. 1, pp. 141–151 (2005)

[5] D. Marcy, W. Tetley, J. Stuart and R. Birge, Three-dimensional data storage using the photochromic protein bacteriorhodopsin, in Proc. of the 22nd Annual International Conference of the IEEE Engineering in Medicine and Biology Society (EMBC'00), pp. 1003–1006 (2000)

[6] D. Oesterhelt, C. Bräuchle and N. Hampp, Bacteriorhodopsin: a biological material for information processing, Quarterly Reviews of Biophysics, vol. 24, pp. 425–478 (1991)

[7] S. Rajasekaran, V. Kumar, S. Sahni and R. Birge, Efficient algorithms for protein-based associative processors and volumetric memories, in Proc. of IEEE NANO 2008, pp. 397–400

[8] S. Rajasekaran, V. Kundeti, R. Birge, V. Kumar and S. Sahni, Efficient algorithms for computing with protein-based volumetric memory processors, IEEE Transactions on Nanotechnology, vol. 10, pp. 881–890 (2010)

[9] D. Trincă and S. Rajasekaran, Coping with diffraction effects in protein-based computing through a specialized approximation algorithm with constant overhead, in Proc. of IEEE NANO 2010, pp. 802–805

[10] D. Trincă and S. Rajasekaran, Specialized compression for coping with diffraction effects in protein-based volumetric memories: solving some challenging instances, Journal of Nanoelectronics and Optoelectronics, vol. 5, pp. 290–294 (2010)

[11] D. Trincă and S. Rajasekaran, Optimizing the APPROXv1 algorithm for coping with diffraction effects in protein-based volumetric memories, in Technical Summaries of SPIE Optical Systems Design, abstract 8167A-71 (at page 14) (2011)

[12] D. Trincă and S. Rajasekaran, Fast Algorithms for Coping with Diffraction Effects in Protein-based Volumetric Memories (Design and Implementation), Journal of Computational and Theoretical Nanoscience, vol. 10, pp. 894–897 (2013)

Biographies

D. Trinca received his BSc degree in 2003 from the Faculty of Computer Science, Alexandru Ioan Cuza University of Iași, Romania, and his PhD degree in 2008 from the Department of Computer Science and Engineering, University of Connecticut, USA. Currently he is with SC Piretus Prod SRL in Osoi, Iași County, Romania. His research interests are in the area of computational problems in nanotechnology.

S. Rajasekaran received his BSc degree in Physics in 1977, from Madurai Kamaraj University, India, and his PhD degree in Computer Science in 1988, from Harvard University, USA. Currently he is with University of Connecticut, USA. His research interests are in the area of applied algorithms.
