A-Z of e-Discovery Terms (M-Z)

Metadata Data about data. All electronic documents contain other information about the document that may not appear on the face of the document. The extent and nature of metadata depends on the type of electronic file. For instance, almost all files will have a ‘creation date’ and a ‘last modified’ date. Certain files, such as photographs taken on digital cameras, capture dozens of items of metadata such as the make and model of camera used, when the photo was taken, the camera settings and even whether the flash went off. For emails, metadata includes the sender, date and time sent, recipients (and whether they were sent the email directly, cc’d or bcc’d) as well as extensive additional information about how the email was transmitted. The usefulness of metadata fields varies from file type to file type. For instance, the ‘author’ of an email will correspond to the email account from which the email was sent (although it does not necessarily follow that the true author was in fact the same as the sender, given that someone else could access and send an email from an email account), whereas the ‘author’ metadata for Word documents and Excel spreadsheets may be misleading, unhelpful or even blank. Scanned paper files (normally in TIFF or PDF format) normally have minimal metadata other than the date of creation (which in turn often bears little resemblance to any date information appearing on the face of the document).

Millnet’s Smart e-Discovery processing services will normally yield between 400 and 600 fields of metadata from electronic documents. Many are irrelevant for the purposes of the legal review and disclosure process and as such only the most useful and common fields are normally included in the review database.

It is best practice to disclose certain fields of metadata along with the documents to which they relate. At a minimum, the metadata fields to disclose (to the extent they exist for a particular document) are author, document date, document type, recipient(s), document name or subject (for emails).
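
As an illustration of the minimum disclosure fields listed above, the following is a minimal sketch of reading them from an email using Python's standard library. The sample message, the addresses and the function name `disclosure_fields` are hypothetical and chosen only for this example; a processing service would extract far more fields than this.

```python
from email import message_from_string
from email.utils import getaddresses, parsedate_to_datetime

# Hypothetical raw message, used only for illustration.
RAW_EMAIL = """\
From: Alice Example <alice@example.com>
To: Bob Example <bob@example.com>
Cc: carol@example.com
Date: Mon, 02 Mar 2009 09:15:00 +0000
Subject: Draft agreement

Please see the attached draft.
"""


def disclosure_fields(raw):
    """Return the minimum metadata fields for an email as a dictionary."""
    msg = message_from_string(raw)
    # Combine direct and cc'd recipients; bcc is rarely present in a received copy.
    recipients = getaddresses(msg.get_all("To", []) + msg.get_all("Cc", []))
    return {
        "author": msg["From"],
        "document_date": parsedate_to_datetime(msg["Date"]).isoformat(),
        "document_type": "email",
        "recipients": [address for _name, address in recipients],
        "subject": msg["Subject"],
    }


fields = disclosure_fields(RAW_EMAIL)
for name, value in fields.items():
    print(f"{name}: {value}")
```

Note that the ‘author’ here is simply the sending account, which is subject to the caveat above about who actually wrote the email.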

Multi-user Refers to the concept of more than one person being able to access the same documents (i.e. database) at the same time. Whilst it is not possible for two or more people to perform certain actions on the same document simultaneously (for example attempting to redact or tag the same document at precisely the same moment), in practice it is rare for users accessing a hosted database system to notice whether or not other users are working in the same database at the same time. Multi-user access is the key to what is often referred to as ‘collaboration’. The advent of multi-user access has led to features that enable Millnet or an appropriately trained administrator to control who can see what (documents, folders, tags, notes, searches etc) within a particular database. This makes it increasingly feasible for groups of users from different organisations to access and collaborate within a single database (e.g. the law firm(s), counsel, experts and / or client(s)).
Native file / format
Refers to the file format of an electronic document relating to the software application used to create the document. For example, the native file format for a spreadsheet is most often Microsoft Excel or, even more specifically, an “.XLS” file, where “.XLS” is the file extension that identifies the software application to which the file relates. It is increasingly common to review, disclose and even present at trial certain types of documents in their native format. For instance, it is best practice to review, disclose and present spreadsheets (the most common type of which is Microsoft Excel) in native format, and increasingly other file types such as Microsoft PowerPoint presentations. The advantage of native files is that they will normally contain metadata (although it may have been modified and it is possible to delete metadata) and it is common for such files to contain information that is not visible when printed to paper or image format (i.e. TIFF or PDF). Disclosure of native spreadsheet files may carry some risk as they may have been subjected to a less rigorous review owing to the possibility of ‘hidden’ information; however, it is best practice to do so as disclosure of such files in paper or image format is far less helpful.
Navigation Refers to the means by which users can move from one document to the next in a database and, more importantly, whether they can move from one place in a collection of documents to another with the ability to return to an earlier point. Most hosted database systems enable users to quickly and easily move or ‘navigate’ one document at a time through a chronologically sorted list of emails, for example. However, the best hosted databases also enable a degree of ‘skipping’ from one point to another to follow a line of enquiry (for example to view similar documents, documents that were attached or email threads). This ability to ‘skip’ from one point to another efficiently and return to a previous point can save considerable time and therefore cost. We often refer to the concepts of ‘linear’ navigation (i.e. one document after another) versus ‘lateral’ navigation (i.e. diverging from a linear review to consider other documents).
Near-duplicate Near-duplicate documents contain a high proportion of the same textual content. Understanding when / why near-duplicates arise in collections of documents can greatly assist with considering the most efficient approach to review and disclosure. Common sources of near-duplication include:

  • Standard or template documents such as letters and agreements where there are only minor differences such as the addressee details between versions.
  • Standard reports such as project progress reports; often these are Excel spreadsheets.
  • Documents or presentations that have been subjected to various rounds of review and amendments in draft form where each round of review involved emailing a group of people for comments.
  • Documents emailed to a wide distribution list and saved by multiple recipients with only minor changes to the original file (such as changing the name of the file).
  • Emails that appear in long chains created by way of the process of forwarding and / or replying over and over to an initial root email.
  • Documents that have been created from a template or common root source document. For example, many presentations or proposal documents are created by altering an existing similar document.
  • Documents that have been converted from one format to another. For example, a Word document that has been converted to PDF and then emailed to multiple people.
  • A printed copy of an electronic document such as an email or letter that has been scanned to create a new electronic file.

Dealing efficiently with near-duplicate documents can significantly reduce the time and cost associated with review and disclosure and therefore impacts what may or may not be proportionate for a particular matter. Millnet provides a near-duplicate and email thread analysis service that facilitates these review efficiencies. The service incorporates not only the software to process data but also the consultative advice around how to make the most of this information and how to weigh up the costs versus benefits. The use of near-duplicate and email thread detection technology must also be properly integrated into the workflow for a particular matter so as to maximise the benefits.
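
To make the idea of ‘a high proportion of the same textual content’ concrete, here is a minimal sketch that scores two texts by the overlap of their three-word sequences and groups documents that exceed a threshold. The technique (word ‘shingles’ compared by Jaccard similarity), the 0.6 threshold and all function names are assumptions made for illustration; this is not a description of the Equivio software or of any particular near-duplicate product.

```python
def shingles(text, size=3):
    """Return the set of overlapping word sequences ('shingles') in a text."""
    words = text.lower().split()
    if len(words) < size:
        return {tuple(words)}
    return {tuple(words[i:i + size]) for i in range(len(words) - size + 1)}


def similarity(a, b, size=3):
    """Jaccard similarity of the two shingle sets, between 0.0 and 1.0."""
    set_a, set_b = shingles(a, size), shingles(b, size)
    return len(set_a & set_b) / len(set_a | set_b)


def near_duplicate_groups(docs, threshold=0.6):
    """Greedily group documents by similarity to the first member of each group."""
    groups = []
    for doc_id, text in docs.items():
        for group in groups:
            if similarity(docs[group[0]], text) >= threshold:
                group.append(doc_id)
                break
        else:
            groups.append([doc_id])
    return groups


# Hypothetical documents: two template letters and an unrelated memo.
docs = {
    "letter_a": "Dear Mr Smith we write to confirm that the project progress "
                "report for March is attached for your review",
    "letter_b": "Dear Ms Jones we write to confirm that the project progress "
                "report for March is attached for your review",
    "memo": "Minutes of the board meeting held on Tuesday at the head office",
}

groups = near_duplicate_groups(docs)
print(groups)
```

The two letters differ only in the addressee and so fall into one group, which a reviewer could then tag together rather than one at a time.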

Non searchable image file
Refer to image files above. The existence of non searchable image files within a collection of electronic documents may have an important bearing on the scope of keyword or other search techniques because the text within such files is not searchable until they have been subjected to Optical Character Recognition (‘OCR’) processing.
Objective Coding

Many clients have worked with processed electronic data, are comfortable using the large amount of information that can be extracted from it, and appreciate the options this gives reviewers.

Paper data, on the other hand, yields very little extracted information. The only really useful information is found within the OCR text, and that will not be 100% accurate because the quality of the original documents affects the OCR results.

The one essential step to gaining control over scanned data is Objective Coding.

Our teams of skilled operators scan the documents into electronic format, capture the relationships between documents and code information from the documents into multiple fields. Adding the human factor to document processing transforms plain images into a library of useful information that can be searched, sorted, grouped and prioritised. Given the short timeframe usually associated with disclosure, the ability to access all the important information in a matter of moments helps clients overcome time and resource constraints.

In today’s explosion of electronic data, we should not forget that every technology has its limitations and that human input is still needed and valued.

Optical Character Recognition ‘OCR’
OCR is the automated process whereby software examines the data within an electronic image file and creates a file containing text to the extent that the software can discern that the data within the image file constitutes letters and other characters.

OCR accuracy has improved significantly over the past 10 years or so. It is now normal to expect 99%+ accuracy when converting a page of textual data. However, you should note that most OCR software examines each character in turn. Whilst a 1% error rate may be relatively small, if there are, say, an average of 5 characters per word, then, if the OCR errors are spread evenly across the words, there would be as much as a 5% error rate for the words on a page (i.e. where a single incorrectly recognised character alters the spelling of a particular word). Fuzzy searching is a feature that can be used to overcome OCR errors by compensating for one or more incorrect characters in the word(s) being searched for.
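
The arithmetic above, and the way fuzzy searching compensates for it, can be sketched as follows. The calculation assumes character errors are independent, which is a simplification. The fuzzy match uses the standard library's `difflib` as a stand-in for whatever fuzzy search a review platform provides, and the sample OCR words and the 0.8 cutoff are hypothetical.

```python
import difflib


def word_error_rate(char_accuracy, chars_per_word):
    """Chance that a word contains at least one misrecognised character."""
    return 1 - char_accuracy ** chars_per_word


def fuzzy_find(term, words, cutoff=0.8):
    """Return words that are close to the search term despite small differences."""
    return difflib.get_close_matches(term, words, n=5, cutoff=cutoff)


# 99% character accuracy and 5 characters per word, as in the text above.
rate = word_error_rate(0.99, 5)
print(f"Approximate word error rate: {rate:.1%}")

# Hypothetical OCR output in which 'contract' was misread with a zero.
ocr_words = ["the", "c0ntract", "was", "signed", "in", "London"]
exact_hits = [word for word in ocr_words if word == "contract"]
fuzzy_hits = fuzzy_find("contract", ocr_words)
print("Exact hits:", exact_hits)
print("Fuzzy hits:", fuzzy_hits)
```

Under the independence assumption the word error rate comes out just under the 5% upper bound quoted above, and the exact keyword search misses the misread word that the fuzzy search still finds.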

One of the key determinants of OCR accuracy is the quality of the original image file. For scanned paper documents it is important to scan every page as carefully as possible and to avoid ‘skewing’ pages and / or introducing any ‘specks’ or ‘marks’ into the image file during the scanning process.

Another consideration when undertaking OCR is the potential for foreign languages. Foreign languages, including those written in non-Latin scripts such as Russian (Cyrillic), Arabic, Japanese and Chinese, can be OCR’d, but results may vary widely. A mixture of different languages in the same file, and especially on the same page, will generally cause OCR software to interpret correctly only the characters for one language.

Another factor to consider with OCR is whether OCR files are to be disclosed. It is normal practice to do so (to the extent that OCR files were created in the first place). If documents have been redacted prior to disclosure, it is critical to ensure that OCR files of the pre-redacted version of a file are not disclosed (this is a remarkably common mistake).

Offline Refers to not being connected to the internet. One limitation of many hosted review services is that users must be connected to the internet in order to access the system. There will be circumstances when an internet connection is not available and / or not sufficiently fast or reliable (e.g. in court or when travelling) and as such there may be a desire to have an ‘offline’ copy of the review database. There are three main limitations / considerations in this regard:

1. Most of the hosted (online) review systems do not have a version for offline use or, to the extent that they do, it is often problematic and costly to set up an offline version.

2. As soon as data is moved offline it is no longer synchronised with the online version. Some offline versions have the ability to synchronise between an offline and online version. However, this is often problematic and can add considerably to costs. For instance, where notes are made, redactions applied or documents tagged for the same documents in an online and an offline version of the database, the potential for conflict between the two versions arises, in which case which changes take priority?

3. Security is potentially a major concern for an offline version of a database. Whilst it is normal for data transmitted over the internet to be encrypted, it is not normal for the data residing in an offline database to be encrypted. Given that most offline database requirements are for mobility and therefore on a laptop, there is a significant risk of loss / theft of the laptop and for most laptops it is relatively easy to access data even when password protected.

Pagination Used both as a noun and a verb (“paginate”) to describe the numbering (or the process of applying numbering) on documents that have been converted to image file format (normally TIFF or PDF format).
PDF Stands for Portable Document Format. This is a format created by Adobe and associated with its Acrobat software. Adobe originally created the PDF format as a means of sharing documents between people without the need for the software application used to create the original file. For instance, if someone created a letter in Microsoft Word version 2007 and converted the document to PDF format, any recipient of the document could open the file using the free Adobe Acrobat Reader program, regardless of whether or not they have Word version 2007.

The success of the PDF has seen this file format become important in other ways. PDF is one of the most common formats for scanning paper documents (the other being TIFF), and more and more organisations opt to convert documents to PDF to make them ‘read only’ and to strip metadata that would otherwise accompany, say, a Word file.

PDF files may be searchable (i.e. contain text embedded within the file) or non searchable. In order to make non searchable PDF files searchable they must be subjected to OCR processing. This is an important consideration when applying keyword searches to a collection of electronic documents as, despite appearing to be textual documents, the text within non searchable PDFs will of course not be searched.

Other considerations relating to PDF files include the tendency of some law firms to provide electronic disclosure in the form of very large multi-page PDF files, typically where each lever arch file or even box of paper documents has been scanned to a single file. Refer to the section on unitisation below for more information on this topic.

Produce / production
Refers to outputting documents in electronic and / or hard copy format from the hosted review database. The two main production events are on disclosure and for bundle preparation. However, it may be necessary to produce sets of documents at various stages where there is a requirement to provide documents to individual(s) who cannot access the documents online. This could include, for example, someone who has a preference for working with paper documents, experts who are not given access to the hosted service and when there is a requirement to have access to documents where there is no internet connection (e.g. when travelling, attending a hearing or other meeting etc).

When producing documents, consideration should always be given to the audit trail of what was produced, when and by whom. Also, to the extent documents are produced in hard copy or electronic format, it is important to consider how such documents will be used and what information may need to be included with the documents in order to track and potentially reconcile back to the hosted database. For instance, documents should always include a unique identifying number ‘printed’ (when in paper or image form) on the documents. Ideally the unique reference should appear on all pages (i.e. Bates numbering) and consideration should be given as to whether it is appropriate to use the same numbering throughout a matter or, for instance, to adopt a different numbering scheme to disguise the existence of any ‘gaps’ in the original numbering sequence where documents have been excluded on the basis of relevance or privilege.
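
The page-level numbering described above can be sketched as follows. The prefix, the six-digit width, the document identifiers and the function name are all hypothetical. The example shows a production in which the database identifiers have gaps (documents withheld for relevance or privilege) while the numbers stamped on the pages run continuously, with the returned mapping kept so that produced pages can be reconciled back to the hosted database.

```python
def bates_numbers(documents, prefix="ABC", start=1, width=6):
    """Assign a sequential identifier to every page of every produced document.

    documents: list of (database_id, page_count) in production order.
    Returns a dictionary mapping each database_id to its page identifiers.
    """
    numbered = {}
    counter = start
    for doc_id, page_count in documents:
        numbered[doc_id] = [
            f"{prefix}{n:0{width}d}" for n in range(counter, counter + page_count)
        ]
        counter += page_count
    return numbered


# Hypothetical production: DOC-0002, DOC-0003 and others were withheld.
production = [("DOC-0001", 2), ("DOC-0004", 1), ("DOC-0009", 3)]
numbered = bates_numbers(production)
for doc_id, pages in numbered.items():
    print(doc_id, pages[0], "to", pages[-1])
```

Whether to renumber in this way or to retain the original identifiers is the judgement call discussed above; the sketch only shows the mechanics.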

Quango Any quasi-autonomous non-governmental organisation. Much detested by Charles Holloway, author of the Smart e-Discovery blog. Usually appears in conjunction with ‘bloated’ (adj.).
Reconstruction The process of re-assembling a file or bundle of documents following deconstruction and scanning. Refer to the definition of deconstruction above for more detail.
Reformat(ting) Refer to the section on format / formatting above.
Relevance (Equivio)
Millnet licenses software from Equivio that is used to analyse near-duplicates and email threads. The latest software from Equivio extends the concept of text based analysis to what they term as ‘relevance’. Equivio Relevance involves an experienced lawyer iteratively reviewing small batches of documents and tagging them according to relevance and / or issues. The Equivio software analyses the textual content of the reviewed documents and uses statistical techniques to compare these to the entire collection so as to suggest how all other documents should be classified. Whilst it is not suggested that this replaces a normal legal review, it is a means by which to prioritise review efforts and to provide an additional level of quality control. This type of technique could also be employed where for instance there is a large volume of documents for review on a matter for which it would be disproportionate to undertake a systematic legal review and there are effective means of ensuring privileged documents will not be disclosed.

The concept of Equivio Relevance is similar to that of conceptual search and other advanced technologies that attempt to accelerate / automate what would otherwise be a manual review process.

Securely wipe (“shred”)
The process of deleting files via, for instance, Windows Explorer does not actually delete the selected files. In the first instance, the files will normally appear in the ‘recycle bin’ which must be ‘emptied’ in order for the file to disappear from view entirely. The concept of the ‘recycle bin’ is intended to safeguard against inadvertent deletion of files.

However, even when the recycle bin is emptied the files still exist and can be retrieved using forensic software. Therefore, when it is important to ensure that electronic data is properly destroyed (i.e. the equivalent to secure shredding of paper), Millnet uses specialist software that firstly deletes all selected data (or more often the entire contents of a hard drive) and then overwrites the hard drive or space where the documents previously resided in such a way that it will never be possible for a forensic examiner to retrieve any data or even fragments thereof from the electronic media.

This process of ‘securely wiping’ hard drives is often performed on highly confidential matters after Millnet has delivered data to a client and the client has confirmed safe receipt. The process is relatively time consuming and hence is normally a chargeable service.
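
The overwrite step is what distinguishes a secure wipe from an ordinary deletion, and the principle can be sketched for a single file as follows. This is an illustration only and is not a substitute for specialist wiping software: overwriting a file in place gives no guarantee on solid state drives, on journalling or copy-on-write file systems, or where backup copies exist. The function name, the number of passes and the sample file are hypothetical.

```python
import os
import tempfile


def overwrite_and_delete(path, passes=3, chunk_size=1024 * 1024):
    """Overwrite a file's contents with random bytes, then delete it."""
    size = os.path.getsize(path)
    with open(path, "r+b") as handle:
        for _ in range(passes):
            handle.seek(0)
            remaining = size
            while remaining > 0:
                block = os.urandom(min(chunk_size, remaining))
                handle.write(block)
                remaining -= len(block)
            # Ask the operating system to push the overwrite to the device.
            handle.flush()
            os.fsync(handle.fileno())
    os.remove(path)


# Demonstration on a throwaway temporary file.
with tempfile.NamedTemporaryFile(delete=False) as tmp:
    tmp.write(b"highly confidential client data")
    demo_path = tmp.name

overwrite_and_delete(demo_path)
still_exists = os.path.exists(demo_path)
print("File still present:", still_exists)
```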

Server A server is a computer that is providing one or more ‘services’ to multiple users. It is common for a server to be dedicated to a particular function such as acting as the central store and control for emails (i.e. an email server) or for storing files shared across a network (a file server).

Server computers have all the basic elements of normal desktop or laptop computers but are generally more powerful, more fault tolerant and more scalable. The key features of a server are the processor chips, the memory (RAM) and the number, size and configuration of disk drives. Typically, many servers are required to provide even a relatively simple hosted review database service.

Servers are also often the main central store of ‘live’ (i.e. not archived) data and will be primary targets when a forensic collection is appropriate. The main targets will typically be the email server(s) and file server(s). File servers are often referred to by IT people by Three Letter Acronyms (‘TLAs’). These include so-called NAS (Network Attached Storage) boxes, DAS (Direct Attached Storage) and SAN (Storage Area Network). Strictly speaking a SAN is not a server as such and may in fact comprise many servers and other devices, but in practice the SAN is often one large server with many hard drives.

Systems files
Systems files are electronic files that are created by a computer (i.e. not the user of the computer) or are part of the software required to run applications such as Microsoft Windows and Office, and that do not contain any text or other information of relevance to the review and disclosure process. When electronic documents are collected via a process of forensic imaging, one of the first steps in culling data will often be the removal of systems files via a process often referred to as deNISTing (see above).
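
As a sketch of how systems files can be culled, the following compares a hash of each collected file against a list of hashes of known software files and keeps only the files that do not match. In practice the reference list would come from a published set such as the NIST National Software Reference Library, from which deNISTing takes its name. The file names, contents and hash list here are hypothetical stand-ins.

```python
import hashlib


def md5_hex(data):
    """Return the MD5 hash of a file's bytes as a hexadecimal string."""
    return hashlib.md5(data).hexdigest()


def remove_known_files(collected, known_hashes):
    """Keep only the files whose hash is not in the list of known system files."""
    return {
        path: data
        for path, data in collected.items()
        if md5_hex(data) not in known_hashes
    }


# Hypothetical stand-ins for system and application files.
system_dll = b"binary content of an operating system library"
office_exe = b"binary content of an application executable"
known_hashes = {md5_hex(system_dll), md5_hex(office_exe)}

collected = {
    "Windows/System32/example.dll": system_dll,
    "Program Files/Office/example.exe": office_exe,
    "Documents/contract.doc": b"user-created contract text",
}

remaining = remove_known_files(collected, known_hashes)
print(sorted(remaining))
```
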
Tab Refer to lateral navigation above.
Tag / Tagging
The electronic equivalent to coloured paper tags used to flag particular documents in a lever arch file. Tagging is the process of assigning classifications (such as relevance, privilege or issue) to one or more documents. The process of creating a logical set of tags and then systematically assigning documents to these tags is one of the fundamental aspects of workflow design when using a hosted review database.
Textual duplicate Two documents that contain identical text. Where these are two electronic files, they may or may not be exact duplicate files, depending on whether their metadata also matches.

Millnet applies a textual comparison process that groups documents together according to the percentage similarity of their text. Textual duplicates are files that are identical in textual content but not MD5 duplicates (i.e. exact file duplicates) owing to differences in metadata. Identifying textual duplicates as part of the process of near-duplicate analysis (refer above) will help to identify documents that most lawyers would consider to be duplicate documents but would otherwise remain in the collection for review following de-duplication. The most common instances where textual duplicates will arise include:

  • When the same electronic document is received as an attachment to an email and then saved by multiple recipients with a range of different file names without having changed any other aspect of the document itself.
  • When a text based document (typically Microsoft Word) is converted into a different format (most commonly PDF) for transmission via email or storing for instance in a document management system.

See also near-duplicate, MD5 and de-duplication for more on the topic of duplicates.
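
The difference between an exact file duplicate and a textual duplicate can be sketched by hashing two things separately: the file's bytes and its extracted text. The byte strings below are hypothetical stand-ins for a Word file and its PDF conversion, and normalising whitespace before hashing the text is an assumption made for this example rather than a description of any particular product.

```python
import hashlib


def file_hash(data):
    """MD5 of the raw file bytes, as used for exact de-duplication."""
    return hashlib.md5(data).hexdigest()


def text_hash(text):
    """MD5 of the extracted text after collapsing runs of whitespace."""
    normalised = " ".join(text.split())
    return hashlib.md5(normalised.encode("utf-8")).hexdigest()


# Hypothetical example: the same letter saved as Word and converted to PDF.
letter_text = "Dear Sir,\nPlease find enclosed the signed agreement."
word_bytes = b"WORD-CONTAINER:" + letter_text.encode("utf-8")
pdf_bytes = b"PDF-CONTAINER:" + letter_text.encode("utf-8")

# Text as it might be extracted from each file, with different line breaks.
word_text = "Dear Sir,\nPlease find enclosed the signed agreement."
pdf_text = "Dear Sir, Please find enclosed\nthe signed agreement."

same_file = file_hash(word_bytes) == file_hash(pdf_bytes)
same_text = text_hash(word_text) == text_hash(pdf_text)
print("Exact (MD5) duplicates:", same_file)
print("Textual duplicates:", same_text)
```

The two files survive de-duplication because their bytes differ, yet the text comparison identifies them as the same document for review purposes.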

TIFF / TIF
The Tagged Image File Format or TIFF is a file format for storing bitmap images. By ‘bitmap image’ we mean the electronic equivalent of a photograph or photocopy representation of either a scanned hard copy file or another file type converted into TIFF format.

TIFF images (file extension of .tiff or .tif) are the most common image file format for scanned hard copy documents and when disclosing documents such as emails in an electronic format other than the original (i.e. ‘native’) file format.

Two Factor Authentication
Most online or ‘hosted’ review databases are accessed via a username and password allocated to a specific user. Assuming the password used is relatively ‘strong’ (i.e. not easy to guess and sufficiently long and complex that it could not be found by a computer program trying all the possible permutations and combinations of characters) then this may be deemed sufficiently secure. There are, however, various risks associated with relying on a single ‘factor’ of authentication (i.e. the act of confirming someone’s identity and right to access the system). Risks include:

  • Someone using software to ‘guess’ the password as mentioned above.
  • It is possible to attach a device to a desktop, laptop or even network which gathers password information or even just keystrokes when typing on the keyboard.
  • Many users have a tendency to use simple / memorable passwords which in turn can be easily guessed.
  • When using another computer at home or in an internet café it is possible that passwords are ‘cached’ (i.e. stored on that computer) so that subsequent users may be able to gain unauthorised access.

In order to improve security, the concept of a second ‘factor’ of authentication involves having another means by which a user’s identity can be confirmed when logging in to an online database. Historically the most common second factor of authentication is to provide users with a small electronic device called a ‘token’ which generates a regularly changing random number which must be entered at the same time as logging in with a username and password. Another piece of software loaded into the online system confirms the validity of the random number, and this acts as a second stage of authentication. Therefore the user must have both the ‘token’ device and a valid username and password in order to access the system.

Millnet’s two factor authentication approach is superior to that of the ‘token’. This involves issuing nominated users with a USB ‘stick’ (i.e. a device that plugs into the PC or laptop) which incorporates a fingerprint reader. The user’s fingerprint is stored on the USB device in an encrypted form and when the user logs in using their username and password they must also swipe their fingerprint, which in turn causes an additional code to be sent for verification of the user’s identity. At no point is the user’s fingerprint stored by or made available to Millnet; it is simply stored on the key device which is held by the user. This approach means that the user must not only have the device in their possession but must also use their own fingerprint alongside their username and password in order to gain access to the system.
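
To illustrate how a ‘token’ and the online system can agree on a regularly changing number without communicating, here is a minimal sketch of a time-based one-time password in the style of the published RFC 6238 standard. It is an assumption that any given token works this way, and the sketch does not describe Millnet’s fingerprint device. The shared secret shown is the test value published in the RFC, and the function names are hypothetical.

```python
import hashlib
import hmac
import struct


def one_time_code(secret, unix_time, step=30, digits=6):
    """Derive the code for the time window containing unix_time."""
    counter = unix_time // step
    digest = hmac.new(secret, struct.pack(">Q", counter), hashlib.sha1).digest()
    # Dynamic truncation: the last nibble selects which four bytes to use.
    offset = digest[-1] & 0x0F
    value = struct.unpack(">I", digest[offset:offset + 4])[0] & 0x7FFFFFFF
    return str(value % 10 ** digits).zfill(digits)


def verify(secret, submitted, unix_time, step=30):
    """Server-side check that the submitted code matches the current window."""
    expected = one_time_code(secret, unix_time, step, digits=len(submitted))
    return hmac.compare_digest(expected, submitted)


# Secret shared between the token and the server (RFC 6238 test value).
shared_secret = b"12345678901234567890"

code_on_token = one_time_code(shared_secret, unix_time=59)
accepted_now = verify(shared_secret, code_on_token, unix_time=59)
accepted_later = verify(shared_secret, code_on_token, unix_time=59 + 120)
print(code_on_token, accepted_now, accepted_later)
```

Because the code depends on both the secret held in the token and the current time, a captured code stops working once the time window has passed, which is what distinguishes it from a cached or guessed password.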

Unitisation The process of splitting image files received in multiple page formats down into individual ‘documents’. Unitisation is a standard part of Millnet’s process when scanning paper documents whereby experienced scanning operators apply agreed rules to split the scanned images at ‘document’ level. Defining what constitutes a document can have a critical bearing on review, disclosure and bundle creation efficiency.

For instance, one tactic often employed by law firms attempting deliberately (or sometimes inadvertently) to be unhelpful on disclosure is to provide electronic files (normally PDF and sometimes TIFF image files) where each file equates to a lever arch file of paper documents. In order to be able to sort, search, organise and categorise the individual documents contained within these large ‘multi-page’ image files, there will be a cost associated with firstly unitising and then also OCR’ing each multi-page file. Further, it may also become necessary to code the files.

To put this into perspective, it will typically cost the disclosing law firm around £30-£50 per lever arch file to scan the files into one large multi-page, non searchable image file whereas it will cost the receiving law firm typically a further £50-£100 per original lever arch file to apply further processes in order to make the documents usable. If the disclosing party had already performed these processing steps and had opted to disclose in such a format so as to be deliberately unhelpful, the result is a duplication of cost and wasted time inconsistent with Part 31 CPR and the Practice Direction. By contrast, if the disclosing firm did not perform these actions, they will put themselves at a disadvantage by not having access to the documents in the most efficient and usable format should the receiving law firm go ahead and perform the additional processing steps noted above.
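
The splitting step and the cost comparison above can be sketched as follows. The first function turns a list of pages on which documents begin (as an operator might mark them) into page ranges for individual documents, and the second multiplies out the per-file cost ranges quoted above. The page numbers, the figure of 20 lever arch files and the function names are hypothetical.

```python
def unitise(total_pages, document_starts):
    """Split pages 1..total_pages into (first_page, last_page) document ranges."""
    # Page 1 always begins a document, whether or not it was marked.
    starts = sorted(set(document_starts) | {1})
    if starts[0] < 1 or starts[-1] > total_pages:
        raise ValueError("document start outside the scanned page range")
    ranges = []
    for index, first in enumerate(starts):
        last = starts[index + 1] - 1 if index + 1 < len(starts) else total_pages
        ranges.append((first, last))
    return ranges


def cost_range(lever_arch_files, low_per_file, high_per_file):
    """Return the (low, high) total cost in pounds for a number of files."""
    return lever_arch_files * low_per_file, lever_arch_files * high_per_file


# A ten page scan in which new documents begin on pages 1, 4, 5 and 9.
documents = unitise(10, [1, 4, 5, 9])
print(documents)

# Per-file ranges from the text above, applied to 20 lever arch files.
scanning = cost_range(20, 30, 50)
further_processing = cost_range(20, 50, 100)
print("Scanning:", scanning, "Further processing:", further_processing)
```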

User(s) Whilst the concept of a ‘user’ is a seemingly straightforward one, it is often complicated by the distinction between ‘named’, ‘generic’ and ‘concurrent’ users. A ‘user’ is someone who is given access to the hosted database for review, search or other purposes. A ‘username’ and a password will be associated with a ‘named’ user, and the username normally corresponds with that user’s name. (Millnet’s format for the usernames of named users is firstname.surname).

‘Generic users’ are where a username and password is allocated for use by more than one person. There will typically be some classes of potential users of a hosted review system that will have a minimal and / or sporadic requirement to use the system. For instance, it may be desirable to provide the end client with a username and password although on the basis that they will not be performing review actions such as searching and classifying documents. The main rationale for providing a ‘generic’ username is to save the additional costs associated with each user. The disadvantage of a generic username that may be shared between different individuals relates to the ‘audit trail’. All actions taken within the hosted database are logged against a username so as to have a complete audit trail. Where actions are taken by an individual using a generic username, it may be difficult to identify that individual should it become necessary to establish who took specific actions in the database, such as classifying documents.

The concept of ‘concurrent’ users can cause some confusion when estimating the per user costs associated with a hosted review system. Millnet’s per user pricing is based on the number of usernames allocated, whether they are specifically for an individual or ‘generic’ (e.g. lawfirmname#1, lawfirmname#2 etc). The number of users actually accessing and using the system at any point in time (i.e. ‘concurrent users’) is not relevant from the perspective of estimating the per user charges. It is normal for matters to pass through various phases where the number of users needing to access the system concurrently fluctuates. For instance, the weeks preceding disclosure often see a peak in demand for the number of users and their concurrent use of the system.

Millnet’s approach is to enable our clients to vary the number of named users on a monthly basis so as to match user costs to peaks and troughs in activity. This does require a degree of active management of the user requirements by someone nominated as the main point of contact between the nominated Millnet Project Manager and the law firm and any other users of the system (e.g. counsel, experts, client etc).

Visualise / Visualisation
Refers to expressing data in a visual / pictorial form. For instance, Nuix has a number of visual functions including pictorial representation of email threads and conversations. Other visual representations of data in an e-discovery context include ‘clustering’ (presenting documents visually grouped together in ‘clusters’ according to the similarity of their content) and ‘thumbnail’ views of picture files (showing many shrunken pictures simultaneously on the screen as opposed to looking at each in turn), which are a more efficient means of reviewing pictures.
Webmail Short form for ‘web based email’. It is an email service intended to be primarily accessed via a web browser (i.e. a software application such as Microsoft Internet Explorer which connects you to the internet) as opposed to via an internal corporate network. Common webmail services include Hotmail, Yahoo and Googlemail. It is common for people to have an office email address (normally using either Microsoft Outlook or Lotus Notes to send / receive emails) as well as one or more webmail accounts for sending / receiving personal emails and attachments. If you have a user’s username and password to access a webmail account, it is a relatively easy process for Millnet to extract a copy of all emails from the account into a hosted database format that can then be used to search, review and classify the emails and their attachments.
Workflow Workflow refers to the approach adopted to efficiently undertake all of the steps required in the end-to-end document management process for a particular matter. Ideally, workflow should take into account all steps from identification of where to search through to presentation of documents at trial.

One of the key areas of workflow that can have a significant bearing on the overall cost of any matter involving the review of large numbers of documents is the approach adopted to using a hosted review system. Key considerations, which take into account many of the factors mentioned elsewhere in this glossary, include:

  • Organising documents in a logical way. Common ways of organising documents include by custodian, firm or location and into chronological batches. By utilising the latest technology, documents may also be organised by issue or keywords.
  • Batching documents for reviewers to review. The objective is to maximise reviewer efficiency (documents reviewed and tagged per hour) whilst minimising risk (tagging documents incorrectly). Consideration should be given to how to group documents into batches so that the review is logical (such as in chronological order or by responsiveness to keywords) and to take into account document families and the existence of duplicates, near duplicates and email threads.
  • Reporting. Most hosted review databases have reporting features to analyse the progress and productivity of reviewers and to assess the overall rate of progress and volume of documents tagged to various classifications.
  • Review approach. Having batched documents for review, what tagging will reviewers undertake and will there be one, two or possibly more passes over the same documents? For instance, a ‘first pass’ review may only classify documents as being relevant, irrelevant and possibly privileged. The second review of relevant documents may then classify documents by issues with possibly a third pass of particular ‘hot’ documents or those where there was any uncertainty as to classification during the first or second pass reviews.
  • Quality control. What processes will be incorporated into the review to ensure that documents are correctly classified and to ensure no privileged documents are inadvertently disclosed?
  • Redaction. The potential requirement for redacting documents should be considered. Most hosted review databases incorporate the ability to redact; however, consideration will need to be given as to what to redact, who should perform redactions and whether there is a requirement to distinguish between different types of redactions.

These are but a few of the planning and process considerations that go into designing an efficient workflow.
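
To make one of the considerations above concrete, the sketch below groups documents into review batches in chronological order while keeping each document family (for example an email and its attachments) in the same batch. The field names (`family_id`, `date`), the sample documents and the batch size are assumptions made for the example and do not describe any particular review platform.

```python
from collections import defaultdict
from datetime import date


def batch_by_family(documents, max_batch_size):
    """Group documents into batches without splitting a family.

    A family larger than max_batch_size is given a batch of its own.
    """
    families = defaultdict(list)
    for doc in documents:
        families[doc["family_id"]].append(doc)

    # Order families by their earliest document date (chronological review).
    ordered = sorted(families.values(),
                     key=lambda fam: min(d["date"] for d in fam))

    batches, current = [], []
    for family in ordered:
        # Start a new batch if adding this family would exceed the limit.
        if current and len(current) + len(family) > max_batch_size:
            batches.append(current)
            current = []
        current.extend(family)
    if current:
        batches.append(current)
    return batches


# Hypothetical documents: two emails with attachments and one standalone email.
docs = [
    {"id": "E1", "family_id": "F1", "date": date(2010, 1, 5)},
    {"id": "E1-att", "family_id": "F1", "date": date(2010, 1, 5)},
    {"id": "E2", "family_id": "F2", "date": date(2010, 1, 3)},
    {"id": "E3", "family_id": "F3", "date": date(2010, 2, 1)},
    {"id": "E3-att", "family_id": "F3", "date": date(2010, 2, 1)},
]

batches = batch_by_family(docs, max_batch_size=3)
batch_ids = [[d["id"] for d in batch] for batch in batches]
print(batch_ids)
```

A real workflow would also need to decide how duplicates, near duplicates and email threads are handled; this sketch deals only with families and date order.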

An efficient approach to workflow can provide a tactical advantage by enabling one firm to cost-effectively review a larger collection of electronic documents than the opponent(s). By doing so, the more efficient firm may uncover evidence that is advantageous, or that might otherwise have been problematic if first received from the opposing firm(s) on disclosure. The risk of ‘unknown unknowns’ may be reduced, and the firm may have a tactical advantage to the extent that issues relating to the scope of search and disclosure arise. There may also be tactical advantages such as ensuring deadlines are met, being efficient in court and having adopted a wider search at a lower cost if arguments over proportionality are raised and when it comes to the awarding of costs.


XML Extensible Markup Language. Code which describes the content of data. A subset of SGML that is used to describe the structure and content of documents. The “extensible” part of its name indicates that it can be used to create new data structures by defining new tags, which makes it more flexible than HTML with its fixed set of tags.
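
As a small illustration of how XML describes both the structure and the content of data, the sketch below parses a made-up record using Python's standard library. The element and attribute names (`document`, `author`, `recipient`, `subject`, `kind`) are invented for the example and are not part of any fixed vocabulary.

```python
import xml.etree.ElementTree as ET

# A made-up record; the tag names are chosen by whoever designs the data.
xml_text = """
<document type="email">
  <author>a.sender@example.com</author>
  <recipient kind="to">b.reader@example.com</recipient>
  <recipient kind="cc">c.copied@example.com</recipient>
  <subject>Draft contract</subject>
</document>
"""

root = ET.fromstring(xml_text)

# The tags say what each piece of data is; the text holds the data itself.
doc_type = root.get("type")
subject = root.findtext("subject")
recipients = [(r.get("kind"), r.text) for r in root.findall("recipient")]

print(doc_type, subject, recipients)
```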
Yottabyte A yottabyte is a unit of information or computer storage equal to one septillion (10^24) bytes (one quadrillion gigabytes). It is commonly abbreviated YB. As of 2010, no system has yet achieved one yottabyte of storage. In fact, the combined space of all the computer hard drives in the entire world does not amount to even one yottabyte. According to one study, all the world’s computers stored approximately 160 exabytes in 2006. As of 2009 the entire internet was estimated to contain close to 500 exabytes or half a zettabyte. There are one thousand zettabytes in a yottabyte: kilobyte (10^3), megabyte (10^6), gigabyte (10^9), terabyte (10^12), petabyte (10^15), exabyte (10^18), zettabyte (10^21), yottabyte (10^24).

A yottabyte is 1,000,000,000,000,000,000,000,000 bytes.
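
The unit relationships in this entry can be checked with a few lines of arithmetic. This sketch assumes the decimal (power-of-ten) definitions used above; the variable names are chosen purely for illustration.

```python
# Decimal storage units expressed as powers of ten (bytes).
UNITS = {
    "kilobyte": 10**3,
    "megabyte": 10**6,
    "gigabyte": 10**9,
    "terabyte": 10**12,
    "petabyte": 10**15,
    "exabyte": 10**18,
    "zettabyte": 10**21,
    "yottabyte": 10**24,
}

zettabytes_per_yottabyte = UNITS["yottabyte"] // UNITS["zettabyte"]
gigabytes_per_yottabyte = UNITS["yottabyte"] // UNITS["gigabyte"]

# The 2009 estimate of 500 exabytes, expressed in zettabytes.
internet_2009_in_zettabytes = 500 * UNITS["exabyte"] / UNITS["zettabyte"]

print(zettabytes_per_yottabyte)     # 1000
print(gigabytes_per_yottabyte)      # 1000000000000000 (one quadrillion)
print(internet_2009_in_zettabytes)  # 0.5
print(f"{UNITS['yottabyte']:,}")    # 1,000,000,000,000,000,000,000,000
```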


Zip file A common file compression format that allows quick and easy storage for transport. Compresses and combines one or more documents by utilising an algorithm that replaces repeated data (such as runs of white space) with a shorter representation and restores it when decompression takes place. Commonly used to combine and send large documents via email.
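
As a minimal sketch of combining and compressing documents, the example below uses Python's standard `zipfile` module to pack two made-up files into one archive held in memory, then unpacks them again. The file names and contents are invented, and the repetitive text is chosen deliberately because repetitive data compresses well.

```python
import io
import zipfile

# Two made-up documents with highly repetitive content.
files = {
    "letter.txt": b"Dear Sir, " * 500,
    "notes.txt": b"meeting notes " * 500,
}

# Combine and compress the documents into a single in-memory zip archive.
buffer = io.BytesIO()
with zipfile.ZipFile(buffer, "w", compression=zipfile.ZIP_DEFLATED) as archive:
    for name, data in files.items():
        archive.writestr(name, data)

original_size = sum(len(data) for data in files.values())
zipped_size = len(buffer.getvalue())

# Decompress and confirm the contents come back unchanged.
with zipfile.ZipFile(io.BytesIO(buffer.getvalue())) as archive:
    restored = {name: archive.read(name) for name in archive.namelist()}

print(original_size, zipped_size, restored == files)
```

Files that are already compressed, such as most photographs, will typically shrink far less than the text in this sketch.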