From 8a2fc7f3949873ca97069d89a39e48a8982b6b68 Mon Sep 17 00:00:00 2001 From: charlet Date: Mon, 5 Sep 2005 07:54:36 +0000 Subject: [PATCH] 2005-09-01 Laurent Pautet * g-pehage.ads, g-pehage.adb (Select_Char_Position): When no character position set is provided, we compute one in order to reduce the maximum length of the keys. This computation first selects a character position between 1 and the minimum length of the keys in order to avoid reducing one of the keys to an empty string. (Initialize, Compute): When the ratio V to K is too low, the algorithm does not converge. The initialization procedure now comes with a maximum number of iterations such that when exceeded, an exception is raised in Compute. The user can initialize this ratio to another value and try again Reformating and updated headers. git-svn-id: svn+ssh://gcc.gnu.org/svn/gcc/trunk@103867 138bc75d-0d04-0410-961f-82ee72b054a4 --- gcc/ada/g-pehage.adb | 1384 ++++++++++++++++++++++++++------------------------ gcc/ada/g-pehage.ads | 190 +++---- 2 files changed, 819 insertions(+), 755 deletions(-) diff --git a/gcc/ada/g-pehage.adb b/gcc/ada/g-pehage.adb index 4cd2d24..a782648 100644 --- a/gcc/ada/g-pehage.adb +++ b/gcc/ada/g-pehage.adb @@ -6,7 +6,7 @@ -- -- -- B o d y -- -- -- --- Copyright (C) 2002-2004 Ada Core Technologies, Inc. -- +-- Copyright (C) 2002-2005 Ada Core Technologies, Inc. -- -- -- -- GNAT is free software; you can redistribute it and/or modify it under -- -- terms of the GNU General Public License as published by the Free Soft- -- @@ -40,24 +40,24 @@ with GNAT.Table; package body GNAT.Perfect_Hash_Generators is - -- We are using the algorithm of J. Czech as described in Zbigniew - -- J. Czech, George Havas, and Bohdan S. Majewski ``An Optimal - -- Algorithm for Generating Minimal Perfect Hash Functions'', - -- Information Processing Letters, 43(1992) pp.257-264, Oct.1992 + -- We are using the algorithm of J. Czech as described in Zbigniew J. + -- Czech, George Havas, and Bohdan S. Majewski ``An Optimal Algorithm for + -- Generating Minimal Perfect Hash Functions'', Information Processing + -- Letters, 43(1992) pp.257-264, Oct.1992 - -- This minimal perfect hash function generator is based on random - -- graphs and produces a hash function of the form: + -- This minimal perfect hash function generator is based on random graphs + -- and produces a hash function of the form: -- h (w) = (g (f1 (w)) + g (f2 (w))) mod m - -- where f1 and f2 are functions that map strings into integers, - -- and g is a function that maps integers into [0, m-1]. h can be - -- order preserving. For instance, let W = {w_0, ..., w_i, ..., + -- where f1 and f2 are functions that map strings into integers, and g is a + -- function that maps integers into [0, m-1]. h can be order preserving. + -- For instance, let W = {w_0, ..., w_i, ..., -- w_m-1}, h can be defined such that h (w_i) = i. - -- This algorithm defines two possible constructions of f1 and - -- f2. Method b) stores the hash function in less memory space at - -- the expense of greater CPU time. + -- This algorithm defines two possible constructions of f1 and f2. Method + -- b) stores the hash function in less memory space at the expense of + -- greater CPU time. -- a) fk (w) = sum (for i in 1 .. length (w)) (Tk (i, w (i))) mod n @@ -65,36 +65,33 @@ package body GNAT.Perfect_Hash_Generators is -- b) fk (w) = sum (for i in 1 .. length (w)) (Tk (i) * w (i)) mod n - -- size (Tk) = max (for w in W) (length (w)) but the table - -- lookups are replaced by multiplications. - - -- where Tk values are randomly generated. n is defined later on - -- but the algorithm recommends to use a value a little bit - -- greater than 2m. Note that for large values of m, the main - -- memory space requirements comes from the memory space for - -- storing function g (>= 2m entries). - - -- Random graphs are frequently used to solve difficult problems - -- that do not have polynomial solutions. This algorithm is based - -- on a weighted undirected graph. It comprises two steps: mapping - -- and assigment. - - -- In the mapping step, a graph G = (V, E) is constructed, where V - -- = {0, 1, ..., n-1} and E = {(for w in W) (f1 (w), f2 (w))}. In - -- order for the assignment step to be successful, G has to be - -- acyclic. To have a high probability of generating an acyclic - -- graph, n >= 2m. If it is not acyclic, Tk have to be regenerated. - - -- In the assignment step, the algorithm builds function g. As G - -- is acyclic, there is a vertex v1 with only one neighbor v2. Let - -- w_i be the word such that v1 = f1 (w_i) and v2 = f2 (w_i). Let - -- g (v1) = 0 by construction and g (v2) = (i - g (v1)) mod n (or - -- to be general, (h (i) - g (v1) mod n). If word w_j is such that - -- v2 = f1 (w_j) and v3 = f2 (w_j), g (v3) = (j - g (v2)) mod n - -- (or to be general, (h (j) - g (v2)) mod n). If w_i has no - -- neighbor, then another vertex is selected. The algorithm - -- traverses G to assign values to all the vertices. It cannot - -- assign a value to an already assigned vertex as G is acyclic. + -- size (Tk) = max (for w in W) (length (w)) but the table lookups are + -- replaced by multiplications. + + -- where Tk values are randomly generated. n is defined later on but the + -- algorithm recommends to use a value a little bit greater than 2m. Note + -- that for large values of m, the main memory space requirements comes + -- from the memory space for storing function g (>= 2m entries). + + -- Random graphs are frequently used to solve difficult problems that do + -- not have polynomial solutions. This algorithm is based on a weighted + -- undirected graph. It comprises two steps: mapping and assigment. + + -- In the mapping step, a graph G = (V, E) is constructed, where = {0, 1, + -- ..., n-1} and E = {(for w in W) (f1 (w), f2 (w))}. In order for the + -- assignment step to be successful, G has to be acyclic. To have a high + -- probability of generating an acyclic graph, n >= 2m. If it is not + -- acyclic, Tk have to be regenerated. + + -- In the assignment step, the algorithm builds function g. As is acyclic, + -- there is a vertex v1 with only one neighbor v2. Let w_i be the word such + -- that v1 = f1 (w_i) and v2 = f2 (w_i). Let g (v1) = 0 by construction and + -- g (v2) = (i - g (v1)) mod n (or to be general, (h (i) - g (v1) mod n). + -- If word w_j is such that v2 = f1 (w_j) and v3 = f2 (w_j), g (v3) = (j - + -- g (v2)) mod (or to be general, (h (j) - g (v2)) mod n). If w_i has no + -- neighbor, then another vertex is selected. The algorithm traverses G to + -- assign values to all the vertices. It cannot assign a value to an + -- already assigned vertex as G is acyclic. subtype Word_Id is Integer; subtype Key_Id is Integer; @@ -109,42 +106,44 @@ package body GNAT.Perfect_Hash_Generators is Max_Word_Length : constant := 32; subtype Word_Type is String (1 .. Max_Word_Length); Null_Word : constant Word_Type := (others => ASCII.NUL); - -- Store keyword in a word. Note that the length of word is - -- limited to 32 characters. + -- Store keyword in a word. Note that the length of word is limited to 32 + -- characters. type Key_Type is record Edge : Edge_Id; end record; - -- A key corresponds to an edge in the algorithm graph. + -- A key corresponds to an edge in the algorithm graph type Vertex_Type is record First : Edge_Id; Last : Edge_Id; end record; - -- A vertex can be involved in several edges. First and Last are - -- the bounds of an array of edges stored in a global edge table. + -- A vertex can be involved in several edges. First and Last are the bounds + -- of an array of edges stored in a global edge table. type Edge_Type is record X : Vertex_Id; Y : Vertex_Id; Key : Key_Id; end record; - -- An edge is a peer of vertices. In the algorithm, a key - -- is associated to an edge. + -- An edge is a peer of vertices. In the algorithm, a key is associated to + -- an edge. package WT is new GNAT.Table (Word_Type, Word_Id, 0, 32, 32); package IT is new GNAT.Table (Integer, Integer, 0, 32, 32); - -- The two main tables. IT is used to store several tables of - -- components containing only integers. + -- The two main tables. IT is used to store several tables of components + -- containing only integers. function Image (Int : Integer; W : Natural := 0) return String; function Image (Str : String; W : Natural := 0) return String; - -- Return a string which includes string Str or integer Int - -- preceded by leading spaces if required by width W. + -- Return a string which includes string Str or integer Int preceded by + -- leading spaces if required by width W. Output : File_Descriptor renames GNAT.OS_Lib.Standout; -- Shortcuts + EOL : constant Character := ASCII.LF; + Max : constant := 78; Last : Natural := 0; Line : String (1 .. Max); @@ -163,24 +162,23 @@ package body GNAT.Perfect_Hash_Generators is F2 : Natural; L2 : Natural; C2 : Natural); - -- Write string S into file F as a element of an array of one or - -- two dimensions. Fk (resp. Lk and Ck) indicates the first (resp - -- last and current) index in the k-th dimension. If F1 = L1 the - -- array is considered as a one dimension array. This dimension is - -- described by F2 and L2. This routine takes care of all the - -- parenthesis, spaces and commas needed to format correctly the - -- array. Moreover, the array is well indented and is wrapped to - -- fit in a 80 col line. When the line is full, the routine writes - -- it into file F. When the array is completed, the routine adds a + -- Write string S into file F as a element of an array of one or two + -- dimensions. Fk (resp. Lk and Ck) indicates the first (resp last and + -- current) index in the k-th dimension. If F1 = L1 the array is considered + -- as a one dimension array. This dimension is described by F2 and L2. This + -- routine takes care of all the parenthesis, spaces and commas needed to + -- format correctly the array. Moreover, the array is well indented and is + -- wrapped to fit in a 80 col line. When the line is full, the routine + -- writes it into file F. When the array is completed, the routine adds -- semi-colon and writes the line into file F. procedure New_Line - (F : File_Descriptor); + (File : File_Descriptor); -- Simulate Ada.Text_IO.New_Line with GNAT.OS_Lib procedure Put - (F : File_Descriptor; - S : String); + (File : File_Descriptor; + Str : String); -- Simulate Ada.Text_IO.Put with GNAT.OS_Lib procedure Put_Used_Char_Set @@ -191,16 +189,18 @@ package body GNAT.Perfect_Hash_Generators is procedure Put_Int_Vector (File : File_Descriptor; Title : String; - Root : Integer; + Vector : Integer; Length : Natural); -- Output a title and a vector procedure Put_Int_Matrix (File : File_Descriptor; Title : String; - Table : Table_Id); - -- Output a title and a matrix. When the matrix has only one - -- non-empty dimension, it is output as a vector. + Table : Table_Id; + Len_1 : Natural; + Len_2 : Natural); + -- Output a title and a matrix. When the matrix has only one non-empty + -- dimension (Len_2 = 0), output a vector. procedure Put_Edges (File : File_Descriptor; @@ -226,82 +226,79 @@ package body GNAT.Perfect_Hash_Generators is -- Character Position Selection -- ---------------------------------- - -- We reduce the maximum key size by selecting representative - -- positions in these keys. We build a matrix with one word per - -- line. We fill the remaining space of a line with ASCII.NUL. The - -- heuristic selects the position that induces the minimum number - -- of collisions. If there are collisions, select another position - -- on the reduced key set responsible of the collisions. Apply the - -- heuristic until there is no more collision. + -- We reduce the maximum key size by selecting representative positions + -- in these keys. We build a matrix with one word per line. We fill the + -- remaining space of a line with ASCII.NUL. The heuristic selects the + -- position that induces the minimum number of collisions. If there are + -- collisions, select another position on the reduced key set responsible + -- of the collisions. Apply the heuristic until there is no more collision. procedure Apply_Position_Selection; -- Apply Position selection and build the reduced key table procedure Parse_Position_Selection (Argument : String); - -- Parse Argument and compute the position set. Argument is a - -- list of substrings separated by commas. Each substring - -- represents a position or a range of positions (like x-y). + -- Parse Argument and compute the position set. Argument is list of + -- substrings separated by commas. Each substring represents a position + -- or a range of positions (like x-y). procedure Select_Character_Set; - -- Define an optimized used character set like Character'Pos in - -- order not to allocate tables of 256 entries. + -- Define an optimized used character set like Character'Pos in order not + -- to allocate tables of 256 entries. procedure Select_Char_Position; - -- Find a min char position set in order to reduce the max key - -- length. The heuristic selects the position that induces the - -- minimum number of collisions. If there are collisions, select - -- another position on the reduced key set responsible of the - -- collisions. Apply the heuristic until there is no collision. + -- Find a min char position set in order to reduce the max key length. The + -- heuristic selects the position that induces the minimum number of + -- collisions. If there are collisions, select another position on the + -- reduced key set responsible of the collisions. Apply the heuristic until + -- there is no collision. ----------------------------- -- Random Graph Generation -- ----------------------------- procedure Random (Seed : in out Natural); - -- Simulate Ada.Discrete_Numerics.Random. + -- Simulate Ada.Discrete_Numerics.Random procedure Generate_Mapping_Table - (T : Table_Id; - L1 : Natural; - L2 : Natural; - S : in out Natural); - -- Random generation of the tables below. T is already allocated. + (Tab : Table_Id; + L1 : Natural; + L2 : Natural; + Seed : in out Natural); + -- Random generation of the tables below. T is already allocated procedure Generate_Mapping_Tables - (Opt : Optimization; - S : in out Natural); - -- Generate the mapping tables T1 and T2. They are used to define : - -- fk (w) = sum (for i in 1 .. length (w)) (Tk (i, w (i))) mod n. - -- Keys, NK and Chars are used to compute the matrix size. + (Opt : Optimization; + Seed : in out Natural); + -- Generate the mapping tables T1 and T2. They are used to define fk (w) = + -- sum (for i in 1 .. length (w)) (Tk (i, w (i))) mod n. Keys, NK and Chars + -- are used to compute the matrix size. --------------------------- -- Algorithm Computation -- --------------------------- procedure Compute_Edges_And_Vertices (Opt : Optimization); - -- Compute the edge and vertex tables. These are empty when a self - -- loop is detected (f1 (w) = f2 (w)). The edge table is sorted by - -- X value and then Y value. Keys is the key table and NK the - -- number of keys. Chars is the set of characters really used in - -- Keys. NV is the number of vertices recommended by the - -- algorithm. T1 and T2 are the mapping tables needed to compute - -- f1 (w) and f2 (w). + -- Compute the edge and vertex tables. These are empty when a self loop is + -- detected (f1 (w) = f2 (w)). The edge table is sorted by X value and then + -- Y value. Keys is the key table and NK the number of keys. Chars is the + -- set of characters really used in Keys. NV is the number of vertices + -- recommended by the algorithm. T1 and T2 are the mapping tables needed to + -- compute f1 (w) and f2 (w). function Acyclic return Boolean; - -- Return True when the graph is acyclic. Vertices is the current - -- vertex table and Edges the current edge table. + -- Return True when the graph is acyclic. Vertices is the current vertex + -- table and Edges the current edge table. procedure Assign_Values_To_Vertices; - -- Execute the assignment step of the algorithm. Keys is the - -- current key table. Vertices and Edges represent the random - -- graph. G is the result of the assignment step such that: + -- Execute the assignment step of the algorithm. Keys is the current key + -- table. Vertices and Edges represent the random graph. G is the result of + -- the assignment step such that: -- h (w) = (g (f1 (w)) + g (f2 (w))) mod m function Sum (Word : Word_Type; Table : Table_Id; - Opt : Optimization) - return Natural; + Opt : Optimization) return Natural; -- For an optimization of CPU_Time return -- fk (w) = sum (for i in 1 .. length (w)) (Tk (i, w (i))) mod n -- For an optimization of Memory_Space return @@ -312,16 +309,18 @@ package body GNAT.Perfect_Hash_Generators is -- Internal Table Management -- ------------------------------- - function Allocate (N : Natural; S : Natural) return Table_Id; - -- procedure Deallocate (N : Natural; S : Natural); + function Allocate (N : Natural; S : Natural := 1) return Table_Id; + -- Allocate N * S ints from IT table + + procedure Free_Tmp_Tables; + -- Deallocate the tables used by the algorithm (but not the keys table) ---------- -- Keys -- ---------- - Key_Size : constant := 1; - Keys : Table_Id := No_Table; - NK : Natural; + Keys : Table_Id := No_Table; + NK : Natural := 0; -- NK : Number of Keys function Initial (K : Key_Id) return Word_Id; @@ -330,64 +329,63 @@ package body GNAT.Perfect_Hash_Generators is function Reduced (K : Key_Id) return Word_Id; pragma Inline (Reduced); - function Get_Key (F : Key_Id) return Key_Type; - procedure Set_Key (F : Key_Id; Item : Key_Type); - -- Comments needed here ??? + function Get_Key (N : Key_Id) return Key_Type; + procedure Set_Key (N : Key_Id; Item : Key_Type); + -- Get or Set Nth element of Keys table ------------------ -- Char_Pos_Set -- ------------------ - Char_Pos_Size : constant := 1; Char_Pos_Set : Table_Id := No_Table; Char_Pos_Set_Len : Natural; -- Character Selected Position Set function Get_Char_Pos (P : Natural) return Natural; procedure Set_Char_Pos (P : Natural; Item : Natural); - -- Comments needed here ??? + -- Get or Set the string position of the Pth selected character ------------------- -- Used_Char_Set -- ------------------- - Used_Char_Size : constant := 1; Used_Char_Set : Table_Id := No_Table; Used_Char_Set_Len : Natural; - -- Used Character Set : Define a new character mapping. When all - -- the characters are not present in the keys, in order to reduce - -- the size of some tables, we redefine the character mapping. + -- Used Character Set : Define a new character mapping. When all the + -- characters are not present in the keys, in order to reduce the size + -- of some tables, we redefine the character mapping. function Get_Used_Char (C : Character) return Natural; procedure Set_Used_Char (C : Character; Item : Natural); - ------------------- - -- Random Tables -- - ------------------- + ------------ + -- Tables -- + ------------ - Rand_Tab_Item_Size : constant := 1; - T1 : Table_Id := No_Table; - T2 : Table_Id := No_Table; - Rand_Tab_Len_1 : Natural; - Rand_Tab_Len_2 : Natural; + T1 : Table_Id := No_Table; + T2 : Table_Id := No_Table; + T1_Len : Natural; + T2_Len : Natural; -- T1 : Values table to compute F1 -- T2 : Values table to compute F2 - function Get_Rand_Tab (T : Integer; X, Y : Natural) return Natural; - procedure Set_Rand_Tab (T : Integer; X, Y : Natural; Item : Natural); + function Get_Table (T : Integer; X, Y : Natural) return Natural; + procedure Set_Table (T : Integer; X, Y : Natural; Item : Natural); - ------------------ - -- Random Graph -- - ------------------ + ----------- + -- Graph -- + ----------- - Graph_Item_Size : constant := 1; - G : Table_Id := No_Table; - Graph_Len : Natural; - -- G : Values table to compute G + G : Table_Id := No_Table; + G_Len : Natural; + -- Values table to compute G - function Get_Graph (F : Natural) return Integer; - procedure Set_Graph (F : Natural; Item : Integer); - -- Comments needed ??? + NT : Natural := Default_Tries; + -- Number of tries running the algorithm before raising an error + + function Get_Graph (N : Natural) return Integer; + procedure Set_Graph (N : Natural; Item : Integer); + -- Get or Set Nth element of graph ----------- -- Edges -- @@ -423,8 +421,9 @@ package body GNAT.Perfect_Hash_Generators is Opt : Optimization; -- Optimization mode (memory vs CPU) - MKL : Natural; - -- Maximum of all the word length + Max_Key_Len : Natural := 0; + Min_Key_Len : Natural := Max_Word_Length; + -- Maximum and minimum of all the word length S : Natural; -- Seed @@ -436,26 +435,23 @@ package body GNAT.Perfect_Hash_Generators is -- Acyclic -- ------------- - function Acyclic return Boolean - is + function Acyclic return Boolean is Marks : array (0 .. NV - 1) of Vertex_Id := (others => No_Vertex); function Traverse - (Edge : Edge_Id; - Mark : Vertex_Id) - return Boolean; - -- Propagate Mark from X to Y. X is already marked. Mark Y and - -- propagate it to the edges of Y except the one representing - -- the same key. Return False when Y is marked with Mark. + (Edge : Edge_Id; + Mark : Vertex_Id) return Boolean; + -- Propagate Mark from X to Y. X is already marked. Mark Y and propagate + -- it to the edges of Y except the one representing the same key. Return + -- False when Y is marked with Mark. -------------- -- Traverse -- -------------- function Traverse - (Edge : Edge_Id; - Mark : Vertex_Id) - return Boolean + (Edge : Edge_Id; + Mark : Vertex_Id) return Boolean is E : constant Edge_Type := Get_Edges (Edge); K : constant Key_Id := E.Key; @@ -473,7 +469,7 @@ package body GNAT.Perfect_Hash_Generators is for J in V.First .. V.Last loop - -- Do not propagate to the edge representing the same key. + -- Do not propagate to the edge representing the same key if Get_Edges (J).Key /= K and then not Traverse (J, Mark) @@ -531,7 +527,6 @@ package body GNAT.Perfect_Hash_Generators is procedure Add (S : String) is Len : constant Natural := S'Length; - begin Line (Last + 1 .. Last + Len) := S; Last := Last + Len; @@ -541,9 +536,8 @@ package body GNAT.Perfect_Hash_Generators is -- Allocate -- -------------- - function Allocate (N : Natural; S : Natural) return Table_Id is + function Allocate (N : Natural; S : Natural := 1) return Table_Id is L : constant Integer := IT.Last; - begin IT.Set_Last (L + N * S); return L + 1; @@ -555,7 +549,7 @@ package body GNAT.Perfect_Hash_Generators is procedure Apply_Position_Selection is begin - WT.Set_Last (2 * NK - 1); + WT.Set_Last (2 * NK); for J in 0 .. NK - 1 loop declare I_Word : constant Word_Type := WT.Table (Initial (J)); @@ -563,8 +557,8 @@ package body GNAT.Perfect_Hash_Generators is Index : Natural := I_Word'First - 1; begin - -- Select the characters of Word included in the - -- position selection. + -- Select the characters of Word included in the position + -- selection. for C in 0 .. Char_Pos_Set_Len - 1 loop exit when I_Word (Get_Char_Pos (C)) = ASCII.NUL; @@ -580,56 +574,6 @@ package body GNAT.Perfect_Hash_Generators is end loop; end Apply_Position_Selection; - ------------- - -- Compute -- - ------------- - - procedure Compute (Position : String := Default_Position) is - begin - Keys := Allocate (NK, Key_Size); - - if Verbose then - Put_Initial_Keys (Output, "Initial Key Table"); - end if; - - if Position'Length /= 0 then - Parse_Position_Selection (Position); - else - Select_Char_Position; - end if; - - if Verbose then - Put_Int_Vector - (Output, "Char Position Set", Char_Pos_Set, Char_Pos_Set_Len); - end if; - - Apply_Position_Selection; - - if Verbose then - Put_Reduced_Keys (Output, "Reduced Keys Table"); - end if; - - Select_Character_Set; - - if Verbose then - Put_Used_Char_Set (Output, "Character Position Table"); - end if; - - -- Perform Czech's algorithm - - loop - Generate_Mapping_Tables (Opt, S); - Compute_Edges_And_Vertices (Opt); - - -- When graph is not empty (no self-loop from previous - -- operation) and not acyclic. - - exit when 0 < Edges_Len and then Acyclic; - end loop; - - Assign_Values_To_Vertices; - end Compute; - ------------------------------- -- Assign_Values_To_Vertices -- ------------------------------- @@ -638,8 +582,8 @@ package body GNAT.Perfect_Hash_Generators is X : Vertex_Id; procedure Assign (X : Vertex_Id); - -- Execute assignment on X's neighbors except the vertex that - -- we are coming from which is already assigned. + -- Execute assignment on X's neighbors except the vertex that we are + -- coming from which is already assigned. ------------ -- Assign -- @@ -649,7 +593,6 @@ package body GNAT.Perfect_Hash_Generators is is E : Edge_Type; V : constant Vertex_Type := Get_Vertices (X); - begin for J in V.First .. V.Last loop E := Get_Edges (J); @@ -667,11 +610,11 @@ package body GNAT.Perfect_Hash_Generators is -- be in the range 0 .. NK. if G = No_Table then - Graph_Len := NV; - G := Allocate (Graph_Len, Graph_Item_Size); + G_Len := NV; + G := Allocate (G_Len, 1); end if; - for J in 0 .. Graph_Len - 1 loop + for J in 0 .. G_Len - 1 loop Set_Graph (J, -1); end loop; @@ -684,17 +627,80 @@ package body GNAT.Perfect_Hash_Generators is end if; end loop; - for J in 0 .. Graph_Len - 1 loop + for J in 0 .. G_Len - 1 loop if Get_Graph (J) = -1 then Set_Graph (J, 0); end if; end loop; if Verbose then - Put_Int_Vector (Output, "Assign Values To Vertices", G, Graph_Len); + Put_Int_Vector (Output, "Assign Values To Vertices", G, G_Len); end if; end Assign_Values_To_Vertices; + ------------- + -- Compute -- + ------------- + + procedure Compute + (Position : String := Default_Position) + is + Success : Boolean := False; + + begin + NV := Natural (K2V * Float (NK)); + + Keys := Allocate (NK); + + if Verbose then + Put_Initial_Keys (Output, "Initial Key Table"); + end if; + + if Position'Length /= 0 then + Parse_Position_Selection (Position); + else + Select_Char_Position; + end if; + + if Verbose then + Put_Int_Vector + (Output, "Char Position Set", Char_Pos_Set, Char_Pos_Set_Len); + end if; + + Apply_Position_Selection; + + if Verbose then + Put_Reduced_Keys (Output, "Reduced Keys Table"); + end if; + + Select_Character_Set; + + if Verbose then + Put_Used_Char_Set (Output, "Character Position Table"); + end if; + + -- Perform Czech's algorithm + + for J in 1 .. NT loop + Generate_Mapping_Tables (Opt, S); + Compute_Edges_And_Vertices (Opt); + + -- When graph is not empty (no self-loop from previous operation) and + -- not acyclic. + + if 0 < Edges_Len and then Acyclic then + Success := True; + exit; + end if; + end loop; + + if not Success then + raise Too_Many_Tries; + end if; + + Assign_Values_To_Vertices; + end Compute; + -------------------------------- -- Compute_Edges_And_Vertices -- -------------------------------- @@ -711,15 +717,6 @@ package body GNAT.Perfect_Hash_Generators is function Lt (L, R : Natural) return Boolean; -- Subprograms needed for GNAT.Heap_Sort_A - ---------- - -- Move -- - ---------- - - procedure Move (From : Natural; To : Natural) is - begin - Set_Edges (To, Get_Edges (From)); - end Move; - -------- -- Lt -- -------- @@ -727,16 +724,24 @@ package body GNAT.Perfect_Hash_Generators is function Lt (L, R : Natural) return Boolean is EL : constant Edge_Type := Get_Edges (L); ER : constant Edge_Type := Get_Edges (R); - begin return EL.X < ER.X or else (EL.X = ER.X and then EL.Y < ER.Y); end Lt; + ---------- + -- Move -- + ---------- + + procedure Move (From : Natural; To : Natural) is + begin + Set_Edges (To, Get_Edges (From)); + end Move; + -- Start of processing for Compute_Edges_And_Vertices begin - -- We store edges from 1 to 2 * NK and leave - -- zero alone in order to use GNAT.Heap_Sort_A. + -- We store edges from 1 to 2 * NK and leave zero alone in order to use + -- GNAT.Heap_Sort_A. Edges_Len := 2 * NK + 1; @@ -783,14 +788,16 @@ package body GNAT.Perfect_Hash_Generators is else if Verbose then Put_Edges (Output, "Unsorted Edge Table"); - Put_Int_Matrix (Output, "Function Table 1", T1); - Put_Int_Matrix (Output, "Function Table 2", T2); + Put_Int_Matrix (Output, "Function Table 1", T1, + T1_Len, T2_Len); + Put_Int_Matrix (Output, "Function Table 2", T2, + T1_Len, T2_Len); end if; - -- Enforce consistency between edges and keys. Construct - -- Vertices and compute the list of neighbors of a vertex - -- First .. Last as Edges is sorted by X and then Y. To - -- compute the neighbor list, sort the edges. + -- Enforce consistency between edges and keys. Construct Vertices and + -- compute the list of neighbors of a vertex First .. Last as Edges + -- is sorted by X and then Y. To compute the neighbor list, sort the + -- edges. Sort (Edges_Len - 1, @@ -799,8 +806,10 @@ package body GNAT.Perfect_Hash_Generators is if Verbose then Put_Edges (Output, "Sorted Edge Table"); - Put_Int_Matrix (Output, "Function Table 1", T1); - Put_Int_Matrix (Output, "Function Table 2", T2); + Put_Int_Matrix (Output, "Function Table 1", T1, + T1_Len, T2_Len); + Put_Int_Matrix (Output, "Function Table 2", T2, + T1_Len, T2_Len); end if; -- Edges valid range is 1 .. 2 * NK @@ -857,8 +866,8 @@ package body GNAT.Perfect_Hash_Generators is when Function_Table_1 | Function_Table_2 => Item_Size := Type_Size (NV); - Length_1 := Rand_Tab_Len_1; - Length_2 := Rand_Tab_Len_2; + Length_1 := T1_Len; + Length_2 := T2_Len; when Graph_Table => Item_Size := Type_Size (NK); @@ -873,11 +882,25 @@ package body GNAT.Perfect_Hash_Generators is procedure Finalize is begin + Free_Tmp_Tables; + WT.Release; IT.Release; + NK := 0; + Max_Key_Len := 0; + Min_Key_Len := Max_Word_Length; + end Finalize; + + --------------------- + -- Free_Tmp_Tables -- + --------------------- + + procedure Free_Tmp_Tables is + begin + IT.Init; + Keys := No_Table; - NK := 0; Char_Pos_Set := No_Table; Char_Pos_Set_Len := 0; @@ -888,34 +911,34 @@ package body GNAT.Perfect_Hash_Generators is T1 := No_Table; T2 := No_Table; - Rand_Tab_Len_1 := 0; - Rand_Tab_Len_2 := 0; + T1_Len := 0; + T2_Len := 0; - G := No_Table; - Graph_Len := 0; + G := No_Table; + G_Len := 0; Edges := No_Table; Edges_Len := 0; - Vertices := No_Table; - NV := 0; - end Finalize; + Vertices := No_Table; + NV := 0; + end Free_Tmp_Tables; ---------------------------- -- Generate_Mapping_Table -- ---------------------------- procedure Generate_Mapping_Table - (T : Integer; - L1 : Natural; - L2 : Natural; - S : in out Natural) + (Tab : Integer; + L1 : Natural; + L2 : Natural; + Seed : in out Natural) is begin for J in 0 .. L1 - 1 loop for K in 0 .. L2 - 1 loop - Random (S); - Set_Rand_Tab (T, J, K, S mod NV); + Random (Seed); + Set_Table (Tab, J, K, Seed mod NV); end loop; end loop; end Generate_Mapping_Table; @@ -925,12 +948,12 @@ package body GNAT.Perfect_Hash_Generators is ----------------------------- procedure Generate_Mapping_Tables - (Opt : Optimization; - S : in out Natural) + (Opt : Optimization; + Seed : in out Natural) is begin - -- If T1 and T2 are already allocated no need to do it - -- twice. Reuse them as their size has not changes. + -- If T1 and T2 are already allocated no need to do it twice. Reuse them + -- as their size has not changed. if T1 = No_Table and then T2 = No_Table then declare @@ -948,22 +971,22 @@ package body GNAT.Perfect_Hash_Generators is end loop; end if; - Rand_Tab_Len_1 := Char_Pos_Set_Len; - Rand_Tab_Len_2 := Used_Char_Last + 1; - T1 := Allocate (Rand_Tab_Len_1 * Rand_Tab_Len_2, - Rand_Tab_Item_Size); - T2 := Allocate (Rand_Tab_Len_1 * Rand_Tab_Len_2, - Rand_Tab_Item_Size); + T1_Len := Char_Pos_Set_Len; + T2_Len := Used_Char_Last + 1; + T1 := Allocate (T1_Len * T2_Len); + T2 := Allocate (T1_Len * T2_Len); end; end if; - Generate_Mapping_Table (T1, Rand_Tab_Len_1, Rand_Tab_Len_2, S); - Generate_Mapping_Table (T2, Rand_Tab_Len_1, Rand_Tab_Len_2, S); + Generate_Mapping_Table (T1, T1_Len, T2_Len, Seed); + Generate_Mapping_Table (T2, T1_Len, T2_Len, Seed); if Verbose then Put_Used_Char_Set (Output, "Used Character Set"); - Put_Int_Matrix (Output, "Function Table 1", T1); - Put_Int_Matrix (Output, "Function Table 2", T2); + Put_Int_Matrix (Output, "Function Table 1", T1, + T1_Len, T2_Len); + Put_Int_Matrix (Output, "Function Table 2", T2, + T1_Len, T2_Len); end if; end Generate_Mapping_Tables; @@ -973,7 +996,6 @@ package body GNAT.Perfect_Hash_Generators is function Get_Char_Pos (P : Natural) return Natural is N : constant Natural := Char_Pos_Set + P; - begin return IT.Table (N); end Get_Char_Pos; @@ -985,7 +1007,6 @@ package body GNAT.Perfect_Hash_Generators is function Get_Edges (F : Natural) return Edge_Type is N : constant Natural := Edges + (F * Edge_Size); E : Edge_Type; - begin E.X := IT.Table (N); E.Y := IT.Table (N + 1); @@ -997,46 +1018,38 @@ package body GNAT.Perfect_Hash_Generators is -- Get_Graph -- --------------- - function Get_Graph (F : Natural) return Integer is - N : constant Natural := G + F * Graph_Item_Size; - + function Get_Graph (N : Natural) return Integer is begin - return IT.Table (N); + return IT.Table (G + N); end Get_Graph; ------------- -- Get_Key -- ------------- - function Get_Key (F : Key_Id) return Key_Type is - N : constant Natural := Keys + F * Key_Size; + function Get_Key (N : Key_Id) return Key_Type is K : Key_Type; - begin - K.Edge := IT.Table (N); + K.Edge := IT.Table (Keys + N); return K; end Get_Key; - ------------------ - -- Get_Rand_Tab -- - ------------------ - - function Get_Rand_Tab (T : Integer; X, Y : Natural) return Natural is - N : constant Natural := - T + ((Y * Rand_Tab_Len_1) + X) * Rand_Tab_Item_Size; + --------------- + -- Get_Table -- + --------------- + function Get_Table (T : Integer; X, Y : Natural) return Natural is + N : constant Natural := T + (Y * T1_Len) + X; begin return IT.Table (N); - end Get_Rand_Tab; + end Get_Table; ------------------- -- Get_Used_Char -- ------------------- function Get_Used_Char (C : Character) return Natural is - N : constant Natural := - Used_Char_Set + Character'Pos (C) * Used_Char_Size; - + N : constant Natural := Used_Char_Set + Character'Pos (C); begin return IT.Table (N); end Get_Used_Char; @@ -1048,7 +1061,6 @@ package body GNAT.Perfect_Hash_Generators is function Get_Vertices (F : Natural) return Vertex_Type is N : constant Natural := Vertices + (F * Vertex_Size); V : Vertex_Type; - begin V.First := IT.Table (N); V.Last := IT.Table (N + 1); @@ -1135,22 +1147,24 @@ package body GNAT.Perfect_Hash_Generators is procedure Initialize (Seed : Natural; K_To_V : Float := Default_K_To_V; - Optim : Optimization := CPU_Time) + Optim : Optimization := CPU_Time; + Tries : Positive := Default_Tries) is begin - WT.Init; - IT.Init; - S := Seed; + -- Free previous tables (the settings may have changed between two runs) - Keys := No_Table; - NK := 0; + Free_Tmp_Tables; - Char_Pos_Set := No_Table; - Char_Pos_Set_Len := 0; + if K_To_V <= 2.0 then + Put (Output, "K to V ratio cannot be lower than 2.0"); + New_Line (Output); + raise Program_Error; + end if; - K2V := K_To_V; - Opt := Optim; - MKL := 0; + S := Seed; + K2V := K_To_V; + Opt := Optim; + NT := Tries; end Initialize; ------------ @@ -1170,8 +1184,19 @@ package body GNAT.Perfect_Hash_Generators is NK := NK + 1; NV := Natural (Float (NK) * K2V); - if MKL < Len then - MKL := Len; + -- Do not accept a value of K2V too close to 2.0 such that once rounded + -- up, NV = 2 * NK because the algorithm would not converge. + + if NV <= 2 * NK then + NV := 2 * NK + 1; + end if; + + if Max_Key_Len < Len then + Max_Key_Len := Len; + end if; + + if Len < Min_Key_Len then + Min_Key_Len := Len; end if; end Insert; @@ -1179,11 +1204,9 @@ package body GNAT.Perfect_Hash_Generators is -- New_Line -- -------------- - procedure New_Line (F : File_Descriptor) is - EOL : constant Character := ASCII.LF; - + procedure New_Line (File : File_Descriptor) is begin - if Write (F, EOL'Address, 1) /= 1 then + if Write (File, EOL'Address, 1) /= 1 then raise Program_Error; end if; end New_Line; @@ -1195,7 +1218,7 @@ package body GNAT.Perfect_Hash_Generators is procedure Parse_Position_Selection (Argument : String) is N : Natural := Argument'First; L : constant Natural := Argument'Last; - M : constant Natural := MKL; + M : constant Natural := Max_Key_Len; T : array (1 .. M) of Boolean := (others => False); @@ -1206,8 +1229,7 @@ package body GNAT.Perfect_Hash_Generators is -- Parse_Index -- ----------------- - function Parse_Index return Natural - is + function Parse_Index return Natural is C : Character := Argument (N); V : Natural := 0; @@ -1235,13 +1257,12 @@ package body GNAT.Perfect_Hash_Generators is -- Start of processing for Parse_Position_Selection begin - Char_Pos_Set_Len := 2 * NK; -- Empty specification means all the positions if L < N then Char_Pos_Set_Len := M; - Char_Pos_Set := Allocate (Char_Pos_Set_Len, Char_Pos_Size); + Char_Pos_Set := Allocate (Char_Pos_Set_Len); for C in 0 .. Char_Pos_Set_Len - 1 loop Set_Char_Pos (C, C + 1); @@ -1292,7 +1313,7 @@ package body GNAT.Perfect_Hash_Generators is -- Fill position selection Char_Pos_Set_Len := N; - Char_Pos_Set := Allocate (Char_Pos_Set_Len, Char_Pos_Size); + Char_Pos_Set := Allocate (Char_Pos_Set_Len); N := 0; for J in T'Range loop @@ -1312,34 +1333,42 @@ package body GNAT.Perfect_Hash_Generators is File : File_Descriptor; Status : Boolean; - -- For call to Close; + -- For call to Close - function Type_Img (L : Natural) return String; - -- Return the larger unsigned type T such that T'Last < L + function Array_Img (N, T, R1 : String; R2 : String := "") return String; + -- Return string "N : constant array (R1[, R2]) of T;" function Range_Img (F, L : Natural; T : String := "") return String; -- Return string "[T range ]F .. L" - function Array_Img (N, T, R1 : String; R2 : String := "") return String; - -- Return string "N : constant array (R1[, R2]) of T;" - - -------------- - -- Type_Img -- - -------------- + function Type_Img (L : Natural) return String; + -- Return the larger unsigned type T such that T'Last < L - function Type_Img (L : Natural) return String is - S : constant String := Image (Type_Size (L)); - U : String := "Unsigned_ "; - N : Natural := 9; + --------------- + -- Array_Img -- + --------------- + function Array_Img + (N, T, R1 : String; + R2 : String := "") return String + is begin - for J in S'Range loop - N := N + 1; - U (N) := S (J); - end loop; + Last := 0; + Add (" "); + Add (N); + Add (" : constant array ("); + Add (R1); - return U (1 .. N); - end Type_Img; + if R2 /= "" then + Add (", "); + Add (R2); + end if; + + Add (") of "); + Add (T); + Add (" :="); + return Line (1 .. Last); + end Array_Img; --------------- -- Range_Img -- @@ -1371,32 +1400,23 @@ package body GNAT.Perfect_Hash_Generators is return RI (1 .. Len); end Range_Img; - --------------- - -- Array_Img -- - --------------- + -------------- + -- Type_Img -- + -------------- - function Array_Img - (N, T, R1 : String; - R2 : String := "") - return String - is - begin - Last := 0; - Add (" "); - Add (N); - Add (" : constant array ("); - Add (R1); + function Type_Img (L : Natural) return String is + S : constant String := Image (Type_Size (L)); + U : String := "Unsigned_ "; + N : Natural := 9; - if R2 /= "" then - Add (", "); - Add (R2); - end if; + begin + for J in S'Range loop + N := N + 1; + U (N) := S (J); + end loop; - Add (") of "); - Add (T); - Add (" :="); - return Line (1 .. Last); - end Array_Img; + return U (1 .. N); + end Type_Img; F : Natural; L : Natural; @@ -1460,7 +1480,7 @@ package body GNAT.Perfect_Hash_Generators is for J in Character'Range loop P := Get_Used_Char (J); - Put (File, Image (P), 0, 0, 0, F, L, Character'Pos (J)); + Put (File, Image (P), 1, 0, 1, F, L, Character'Pos (J)); end loop; New_Line (File); @@ -1473,7 +1493,7 @@ package body GNAT.Perfect_Hash_Generators is New_Line (File); for J in F .. L loop - Put (File, Image (Get_Char_Pos (J)), 0, 0, 0, F, L, J); + Put (File, Image (Get_Char_Pos (J)), 1, 0, 1, F, L, J); end loop; New_Line (File); @@ -1482,17 +1502,16 @@ package body GNAT.Perfect_Hash_Generators is Put_Int_Matrix (File, Array_Img ("T1", Type_Img (NV), - Range_Img (0, Rand_Tab_Len_1 - 1), - Range_Img (0, Rand_Tab_Len_2 - 1, - Type_Img (256))), - T1); + Range_Img (0, T1_Len - 1), + Range_Img (0, T2_Len - 1, Type_Img (256))), + T1, T1_Len, T2_Len); else Put_Int_Matrix (File, Array_Img ("T1", Type_Img (NV), - Range_Img (0, Rand_Tab_Len_1 - 1)), - T1); + Range_Img (0, T1_Len - 1)), + T1, T1_Len, 0); end if; New_Line (File); @@ -1501,17 +1520,16 @@ package body GNAT.Perfect_Hash_Generators is Put_Int_Matrix (File, Array_Img ("T2", Type_Img (NV), - Range_Img (0, Rand_Tab_Len_1 - 1), - Range_Img (0, Rand_Tab_Len_2 - 1, - Type_Img (256))), - T2); + Range_Img (0, T1_Len - 1), + Range_Img (0, T2_Len - 1, Type_Img (256))), + T2, T1_Len, T2_Len); else Put_Int_Matrix (File, Array_Img ("T2", Type_Img (NV), - Range_Img (0, Rand_Tab_Len_1 - 1)), - T2); + Range_Img (0, T1_Len - 1)), + T2, T1_Len, 0); end if; New_Line (File); @@ -1519,8 +1537,8 @@ package body GNAT.Perfect_Hash_Generators is Put_Int_Vector (File, Array_Img ("G", Type_Img (NK), - Range_Img (0, Graph_Len - 1)), - G, Graph_Len); + Range_Img (0, G_Len - 1)), + G, G_Len); New_Line (File); Put (File, " function Hash (S : String) return Natural is"); @@ -1621,11 +1639,11 @@ package body GNAT.Perfect_Hash_Generators is -- Put -- --------- - procedure Put (F : File_Descriptor; S : String) is - Len : constant Natural := S'Length; + procedure Put (File : File_Descriptor; Str : String) is + Len : constant Natural := Str'Length; begin - if Write (F, S'Address, Len) /= Len then + if Write (File, Str'Address, Len) /= Len then raise Program_Error; end if; end Put; @@ -1647,6 +1665,7 @@ package body GNAT.Perfect_Hash_Generators is Len : constant Natural := S'Length; procedure Flush; + -- Write current line, followed by LF ----------- -- Flush -- @@ -1674,9 +1693,12 @@ package body GNAT.Perfect_Hash_Generators is Line (Last + 1 .. Last + 5) := " "; Last := Last + 5; - if F1 /= L1 then + if F1 <= L1 then if C1 = F1 and then C2 = F2 then Add ('('); + if F1 = L1 then + Add ("0 .. 0 => "); + end if; else Add (' '); end if; @@ -1685,6 +1707,9 @@ package body GNAT.Perfect_Hash_Generators is if C2 = F2 then Add ('('); + if F2 = L2 then + Add ("0 .. 0 => "); + end if; else Add (' '); end if; @@ -1695,7 +1720,7 @@ package body GNAT.Perfect_Hash_Generators is if C2 = L2 then Add (')'); - if F1 = L1 then + if F1 > L1 then Add (';'); Flush; elsif C1 /= L1 then @@ -1712,56 +1737,91 @@ package body GNAT.Perfect_Hash_Generators is end if; end Put; - ----------------------- - -- Put_Used_Char_Set -- - ----------------------- + --------------- + -- Put_Edges -- + --------------- - procedure Put_Used_Char_Set + procedure Put_Edges (File : File_Descriptor; Title : String) is - F : constant Natural := Character'Pos (Character'First); - L : constant Natural := Character'Pos (Character'Last); + E : Edge_Type; + F1 : constant Natural := 1; + L1 : constant Natural := Edges_Len - 1; + M : constant Natural := Max / 5; begin Put (File, Title); New_Line (File); - for J in Character'Range loop - Put - (File, Image (Get_Used_Char (J)), 0, 0, 0, F, L, Character'Pos (J)); + -- Edges valid range is 1 .. Edge_Len - 1 + + for J in F1 .. L1 loop + E := Get_Edges (J); + Put (File, Image (J, M), F1, L1, J, 1, 4, 1); + Put (File, Image (E.X, M), F1, L1, J, 1, 4, 2); + Put (File, Image (E.Y, M), F1, L1, J, 1, 4, 3); + Put (File, Image (E.Key, M), F1, L1, J, 1, 4, 4); end loop; - end Put_Used_Char_Set; + end Put_Edges; - ---------- - -- Put -- - ---------- + ---------------------- + -- Put_Initial_Keys -- + ---------------------- + + procedure Put_Initial_Keys + (File : File_Descriptor; + Title : String) + is + F1 : constant Natural := 0; + L1 : constant Natural := NK - 1; + M : constant Natural := Max / 5; + K : Key_Type; + + begin + Put (File, Title); + New_Line (File); + + for J in F1 .. L1 loop + K := Get_Key (J); + Put (File, Image (J, M), F1, L1, J, 1, 3, 1); + Put (File, Image (K.Edge, M), F1, L1, J, 1, 3, 2); + Put (File, WT.Table (Initial (J)), F1, L1, J, 1, 3, 3); + end loop; + end Put_Initial_Keys; + + -------------------- + -- Put_Int_Matrix -- + -------------------- procedure Put_Int_Matrix (File : File_Descriptor; Title : String; - Table : Integer) + Table : Integer; + Len_1 : Natural; + Len_2 : Natural) is - F1 : constant Natural := 0; - L1 : constant Natural := Rand_Tab_Len_1 - 1; - F2 : constant Natural := 0; - L2 : constant Natural := Rand_Tab_Len_2 - 1; + F1 : constant Integer := 0; + L1 : constant Integer := Len_1 - 1; + F2 : constant Integer := 0; + L2 : constant Integer := Len_2 - 1; + I : Natural; begin Put (File, Title); New_Line (File); - if L2 = F2 then + if Len_2 = 0 then for J in F1 .. L1 loop - Put (File, - Image (Get_Rand_Tab (Table, J, F2)), 0, 0, 0, F1, L1, J); + I := IT.Table (Table + J); + Put (File, Image (I), 1, 0, 1, F1, L1, J); end loop; else for J in F1 .. L1 loop for K in F2 .. L2 loop - Put (File, - Image (Get_Rand_Tab (Table, J, K)), F1, L1, J, F2, L2, K); + I := IT.Table (Table + J + K * Len_1); + Put (File, Image (I), F1, L1, J, F2, L2, K); end loop; end loop; end if; @@ -1774,7 +1834,7 @@ package body GNAT.Perfect_Hash_Generators is procedure Put_Int_Vector (File : File_Descriptor; Title : String; - Root : Integer; + Vector : Integer; Length : Natural) is F2 : constant Natural := 0; @@ -1785,43 +1845,15 @@ package body GNAT.Perfect_Hash_Generators is New_Line (File); for J in F2 .. L2 loop - Put (File, Image (IT.Table (Root + J)), 0, 0, 0, F2, L2, J); + Put (File, Image (IT.Table (Vector + J)), 1, 0, 1, F2, L2, J); end loop; end Put_Int_Vector; - --------------- - -- Put_Edges -- - --------------- - - procedure Put_Edges - (File : File_Descriptor; - Title : String) - is - E : Edge_Type; - F1 : constant Natural := 1; - L1 : constant Natural := Edges_Len - 1; - M : constant Natural := Max / 5; - - begin - Put (File, Title); - New_Line (File); - - -- Edges valid range is 1 .. Edge_Len - 1 - - for J in F1 .. L1 loop - E := Get_Edges (J); - Put (File, Image (J, M), F1, L1, J, 1, 4, 1); - Put (File, Image (E.X, M), F1, L1, J, 1, 4, 2); - Put (File, Image (E.Y, M), F1, L1, J, 1, 4, 3); - Put (File, Image (E.Key, M), F1, L1, J, 1, 4, 4); - end loop; - end Put_Edges; - - --------------------------- - -- Put_Initial_Keys -- - --------------------------- + ---------------------- + -- Put_Reduced_Keys -- + ---------------------- - procedure Put_Initial_Keys + procedure Put_Reduced_Keys (File : File_Descriptor; Title : String) is @@ -1838,34 +1870,30 @@ package body GNAT.Perfect_Hash_Generators is K := Get_Key (J); Put (File, Image (J, M), F1, L1, J, 1, 3, 1); Put (File, Image (K.Edge, M), F1, L1, J, 1, 3, 2); - Put (File, WT.Table (Initial (J)), F1, L1, J, 1, 3, 3); + Put (File, WT.Table (Reduced (J)), F1, L1, J, 1, 3, 3); end loop; - end Put_Initial_Keys; + end Put_Reduced_Keys; - --------------------------- - -- Put_Reduced_Keys -- - --------------------------- + ----------------------- + -- Put_Used_Char_Set -- + ----------------------- - procedure Put_Reduced_Keys + procedure Put_Used_Char_Set (File : File_Descriptor; Title : String) is - F1 : constant Natural := 0; - L1 : constant Natural := NK - 1; - M : constant Natural := Max / 5; - K : Key_Type; + F : constant Natural := Character'Pos (Character'First); + L : constant Natural := Character'Pos (Character'Last); begin Put (File, Title); New_Line (File); - for J in F1 .. L1 loop - K := Get_Key (J); - Put (File, Image (J, M), F1, L1, J, 1, 3, 1); - Put (File, Image (K.Edge, M), F1, L1, J, 1, 3, 2); - Put (File, WT.Table (Reduced (J)), F1, L1, J, 1, 3, 3); + for J in Character'Range loop + Put + (File, Image (Get_Used_Char (J)), 1, 0, 1, F, L, Character'Pos (J)); end loop; - end Put_Reduced_Keys; + end Put_Used_Char_Set; ---------------------- -- Put_Vertex_Table -- @@ -1898,8 +1926,8 @@ package body GNAT.Perfect_Hash_Generators is procedure Random (Seed : in out Natural) is - -- Park & Miller Standard Minimal using Schrage's algorithm to - -- avoid overflow: Xn+1 = 16807 * Xn mod (2 ** 31 - 1) + -- Park & Miller Standard Minimal using Schrage's algorithm to avoid + -- overflow: Xn+1 = 16807 * Xn mod (2 ** 31 - 1) R : Natural; Q : Natural; @@ -1923,40 +1951,10 @@ package body GNAT.Perfect_Hash_Generators is function Reduced (K : Key_Id) return Word_Id is begin - return K + NK; + return K + NK + 1; end Reduced; -------------------------- - -- Select_Character_Set -- - -------------------------- - - procedure Select_Character_Set - is - Last : Natural := 0; - Used : array (Character) of Boolean := (others => False); - - begin - for J in 0 .. NK - 1 loop - for K in 1 .. Max_Word_Length loop - exit when WT.Table (Initial (J))(K) = ASCII.NUL; - Used (WT.Table (Initial (J))(K)) := True; - end loop; - end loop; - - Used_Char_Set_Len := 256; - Used_Char_Set := Allocate (Used_Char_Set_Len, Used_Char_Size); - - for J in Used'Range loop - if Used (J) then - Set_Used_Char (J, Last); - Last := Last + 1; - else - Set_Used_Char (J, 0); - end if; - end loop; - end Select_Character_Set; - - -------------------------- -- Select_Char_Position -- -------------------------- @@ -1968,21 +1966,21 @@ package body GNAT.Perfect_Hash_Generators is (Table : in out Vertex_Table_Type; Last : in out Natural; Pos : in Natural); - -- Build a list of keys subsets that are identical with the - -- current position selection plus Pos. Once this routine is - -- called, reduced words are sorted by subsets and each item - -- (First, Last) in Sets defines the range of identical keys. - - function Count_Identical_Keys - (Table : Vertex_Table_Type; - Last : Natural; - Pos : Natural) - return Natural; - -- For each subset in Sets, count the number of identical keys - -- if we add Pos to the current position selection. - - Sel_Position : IT.Table_Type (1 .. MKL); + -- Build a list of keys subsets that are identical with the current + -- position selection plus Pos. Once this routine is called, reduced + -- words are sorted by subsets and each item (First, Last) in Sets + -- defines the range of identical keys. + + function Count_Different_Keys + (Table : Vertex_Table_Type; + Last : Natural; + Pos : Natural) return Natural; + -- For each subset in Sets, count the number of different keys if we add + -- Pos to the current position selection. + + Sel_Position : IT.Table_Type (1 .. Max_Key_Len); Last_Sel_Pos : Natural := 0; + Max_Sel_Pos : Natural := 0; ------------------------------- -- Build_Identical_Keys_Sets -- @@ -2001,123 +1999,122 @@ package body GNAT.Perfect_Hash_Generators is L : Integer; -- First and last words of a subset - begin - Last := 0; + Offset : Natural; + -- GNAT.Heap_Sort assumes that the first array index is 1. Offset + -- defines the translation to operate. - -- For each subset in S, extract the new subsets we have by - -- adding C in the position selection. + function Lt (L, R : Natural) return Boolean; + procedure Move (From : Natural; To : Natural); + -- Subprograms needed by GNAT.Heap_Sort_A - for J in S'Range loop - declare - Offset : Natural; - -- GNAT.Heap_Sort assumes that the first array index - -- is 1. Offset defines the translation to operate. - - procedure Move (From : Natural; To : Natural); - function Lt (L, R : Natural) return Boolean; - -- Subprograms needed by GNAT.Heap_Sort_A - - ---------- - -- Move -- - ---------- - - procedure Move (From : Natural; To : Natural) is - Target, Source : Natural; - - begin - if From = 0 then - Source := 0; - Target := Offset + To; - elsif To = 0 then - Source := Offset + From; - Target := 0; - else - Source := Offset + From; - Target := Offset + To; - end if; + -------- + -- Lt -- + -------- - WT.Table (Reduced (Target)) := WT.Table (Reduced (Source)); - end Move; - - -------- - -- Lt -- - -------- - - function Lt (L, R : Natural) return Boolean is - C : constant Natural := Pos; - Left : Natural; - Right : Natural; - - begin - if L = 0 then - Left := 0; - Right := Offset + R; - elsif R = 0 then - Left := Offset + L; - Right := 0; - else - Left := Offset + L; - Right := Offset + R; - end if; + function Lt (L, R : Natural) return Boolean is + C : constant Natural := Pos; + Left : Natural; + Right : Natural; - return WT.Table (Reduced (Left))(C) - < WT.Table (Reduced (Right))(C); - end Lt; + begin + if L = 0 then + Left := Reduced (0) - 1; + Right := Offset + R; + elsif R = 0 then + Left := Offset + L; + Right := Reduced (0) - 1; + else + Left := Offset + L; + Right := Offset + R; + end if; - -- Start of processing for Build_Identical_Key_Sets + return WT.Table (Left)(C) < WT.Table (Right)(C); + end Lt; - begin - Offset := S (J).First - 1; + ---------- + -- Move -- + ---------- + + procedure Move (From : Natural; To : Natural) is + Target, Source : Natural; + + begin + if From = 0 then + Source := Reduced (0) - 1; + Target := Offset + To; + elsif To = 0 then + Source := Offset + From; + Target := Reduced (0) - 1; + else + Source := Offset + From; + Target := Offset + To; + end if; + + WT.Table (Target) := WT.Table (Source); + end Move; + + -- Start of processing for Build_Identical_Key_Sets + + begin + Last := 0; + + -- For each subset in S, extract the new subsets we have by adding C + -- in the position selection. + + for J in S'Range loop + if S (J).First = S (J).Last then + F := S (J).First; + L := S (J).Last; + Last := Last + 1; + Table (Last) := (F, L); + + else + Offset := Reduced (S (J).First) - 1; Sort (S (J).Last - S (J).First + 1, Move'Unrestricted_Access, Lt'Unrestricted_Access); - F := -1; - L := -1; - for N in S (J).First .. S (J).Last - 1 loop + F := S (J).First; + L := F; + for N in S (J).First .. S (J).Last loop - -- Two contiguous words are identical + -- For the last item, close the last subset - if WT.Table (Reduced (N))(C) = - WT.Table (Reduced (N + 1))(C) - then - -- This is the first word of the subset + if N = S (J).Last then + Last := Last + 1; + Table (Last) := (F, N); - if F = -1 then - F := N; - end if; + -- Two contiguous words are identical when they have the + -- same Cth character. + elsif WT.Table (Reduced (N))(C) = + WT.Table (Reduced (N + 1))(C) + then L := N + 1; - -- This is the last word of the subset + -- Find a new subset of identical keys. Store the current + -- one and create a new subset. - elsif F /= -1 then + else Last := Last + 1; Table (Last) := (F, L); - F := -1; + F := N + 1; + L := F; end if; end loop; - - -- This is the last word of the subset and of the set - - if F /= -1 then - Last := Last + 1; - Table (Last) := (F, L); - end if; - end; + end if; end loop; end Build_Identical_Keys_Sets; -------------------------- - -- Count_Identical_Keys -- + -- Count_Different_Keys -- -------------------------- - function Count_Identical_Keys - (Table : Vertex_Table_Type; - Last : Natural; - Pos : Natural) - return Natural + function Count_Different_Keys + (Table : Vertex_Table_Type; + Last : Natural; + Pos : Natural) return Natural is N : array (Character) of Natural; C : Character; @@ -2125,9 +2122,9 @@ package body GNAT.Perfect_Hash_Generators is begin -- For each subset, count the number of words that are still - -- identical when we include Sel_Position (Last_Sel_Pos) in - -- the position selection. Only focus on this position as the - -- other positions already produce identical keys. + -- different when we include Pos in the position selection. Only + -- focus on this position as the other positions already produce + -- identical keys. for S in 1 .. Last loop @@ -2139,68 +2136,85 @@ package body GNAT.Perfect_Hash_Generators is N (C) := N (C) + 1; end loop; - -- Add to the total when there are two identical keys + -- Update the number of different keys. Each character used + -- denotes a different key. for J in N'Range loop - if N (J) > 1 then - T := T + N (J); + if N (J) > 0 then + T := T + 1; end if; end loop; end loop; return T; - end Count_Identical_Keys; + end Count_Different_Keys; -- Start of processing for Select_Char_Position begin - for C in Sel_Position'Range loop - Sel_Position (C) := C; - end loop; - - -- Initialization of Words - - WT.Set_Last (2 * NK - 1); + -- Initialize the reduced words set + WT.Set_Last (2 * NK); for K in 0 .. NK - 1 loop - WT.Table (Reduced (K) + 1) := WT.Table (Initial (K)); + WT.Table (Reduced (K)) := WT.Table (Initial (K)); end loop; declare - Collisions : Natural; - Min_Collisions : Natural := NK; - Old_Collisions : Natural; - Min_Coll_Sel_Pos : Natural := 0; -- init to kill warning - Min_Coll_Sel_Pos_Idx : Natural := 0; -- init to kill warning + Differences : Natural; + Max_Differences : Natural := 0; + Old_Differences : Natural; + Max_Diff_Sel_Pos : Natural := 0; -- init to kill warning + Max_Diff_Sel_Pos_Idx : Natural := 0; -- init to kill warning Same_Keys_Sets_Table : Vertex_Table_Type (1 .. NK); Same_Keys_Sets_Last : Natural := 1; begin - Same_Keys_Sets_Table (1) := (1, NK); + for C in Sel_Position'Range loop + Sel_Position (C) := C; + end loop; + + Same_Keys_Sets_Table (1) := (0, NK - 1); loop - -- Preserve minimum identical keys and check later on - -- that this value is strictly decrementing. Otherwise, - -- it means that two keys are stricly identical. + -- Preserve maximum number of different keys and check later on + -- that this value is strictly incrementing. Otherwise, it means + -- that two keys are stricly identical. + + Old_Differences := Max_Differences; - Old_Collisions := Min_Collisions; + -- The first position should not exceed the minimum key length. + -- Otherwise, we may end up with an empty word once reduced. - -- Find which position reduces the most of collisions + if Last_Sel_Pos = 0 then + Max_Sel_Pos := Min_Key_Len; + else + Max_Sel_Pos := Max_Key_Len; + end if; - for J in Last_Sel_Pos + 1 .. Sel_Position'Last loop - Collisions := Count_Identical_Keys + -- Find which position increases more the number of differences + + for J in Last_Sel_Pos + 1 .. Max_Sel_Pos loop + Differences := Count_Different_Keys (Same_Keys_Sets_Table, Same_Keys_Sets_Last, Sel_Position (J)); - if Collisions < Min_Collisions then - Min_Collisions := Collisions; - Min_Coll_Sel_Pos := Sel_Position (J); - Min_Coll_Sel_Pos_Idx := J; + if Verbose then + Put (Output, + "Selecting position" & Sel_Position (J)'Img & + " results in" & Differences'Img & + " differences"); + New_Line (Output); + end if; + + if Differences > Max_Differences then + Max_Differences := Differences; + Max_Diff_Sel_Pos := Sel_Position (J); + Max_Diff_Sel_Pos_Idx := J; end if; end loop; - if Old_Collisions = Min_Collisions then + if Old_Differences = Max_Differences then Raise_Exception (Program_Error'Identity, "some keys are identical"); end if; @@ -2208,43 +2222,95 @@ package body GNAT.Perfect_Hash_Generators is -- Insert selected position and sort Sel_Position table Last_Sel_Pos := Last_Sel_Pos + 1; - Sel_Position (Last_Sel_Pos + 1 .. Min_Coll_Sel_Pos_Idx) := - Sel_Position (Last_Sel_Pos .. Min_Coll_Sel_Pos_Idx - 1); - Sel_Position (Last_Sel_Pos) := Min_Coll_Sel_Pos; + Sel_Position (Last_Sel_Pos + 1 .. Max_Diff_Sel_Pos_Idx) := + Sel_Position (Last_Sel_Pos .. Max_Diff_Sel_Pos_Idx - 1); + Sel_Position (Last_Sel_Pos) := Max_Diff_Sel_Pos; for P in 1 .. Last_Sel_Pos - 1 loop - if Min_Coll_Sel_Pos < Sel_Position (P) then + if Max_Diff_Sel_Pos < Sel_Position (P) then Sel_Position (P + 1 .. Last_Sel_Pos) := Sel_Position (P .. Last_Sel_Pos - 1); - Sel_Position (P) := Min_Coll_Sel_Pos; + Sel_Position (P) := Max_Diff_Sel_Pos; exit; end if; end loop; - exit when Min_Collisions = 0; + exit when Max_Differences = NK; Build_Identical_Keys_Sets (Same_Keys_Sets_Table, Same_Keys_Sets_Last, - Min_Coll_Sel_Pos); + Max_Diff_Sel_Pos); + + if Verbose then + Put (Output, + "Selecting position" & Max_Diff_Sel_Pos'Img & + " results in" & Max_Differences'Img & + " differences"); + New_Line (Output); + Put (Output, "--"); + New_Line (Output); + for J in 1 .. Same_Keys_Sets_Last loop + for K in + Same_Keys_Sets_Table (J).First .. + Same_Keys_Sets_Table (J).Last + loop + Put (Output, WT.Table (Reduced (K))); + New_Line (Output); + end loop; + Put (Output, "--"); + New_Line (Output); + end loop; + end if; end loop; end; Char_Pos_Set_Len := Last_Sel_Pos; - Char_Pos_Set := Allocate (Char_Pos_Set_Len, Char_Pos_Size); + Char_Pos_Set := Allocate (Char_Pos_Set_Len); for C in 1 .. Last_Sel_Pos loop Set_Char_Pos (C - 1, Sel_Position (C)); end loop; end Select_Char_Position; + -------------------------- + -- Select_Character_Set -- + -------------------------- + + procedure Select_Character_Set + is + Last : Natural := 0; + Used : array (Character) of Boolean := (others => False); + Char : Character; + + begin + for J in 0 .. NK - 1 loop + for K in 0 .. Char_Pos_Set_Len - 1 loop + Char := WT.Table (Initial (J))(Get_Char_Pos (K)); + exit when Char = ASCII.NUL; + Used (Char) := True; + end loop; + end loop; + + Used_Char_Set_Len := 256; + Used_Char_Set := Allocate (Used_Char_Set_Len); + + for J in Used'Range loop + if Used (J) then + Set_Used_Char (J, Last); + Last := Last + 1; + else + Set_Used_Char (J, 0); + end if; + end loop; + end Select_Character_Set; + ------------------ -- Set_Char_Pos -- ------------------ procedure Set_Char_Pos (P : Natural; Item : Natural) is N : constant Natural := Char_Pos_Set + P; - begin IT.Table (N) := Item; end Set_Char_Pos; @@ -2255,7 +2321,6 @@ package body GNAT.Perfect_Hash_Generators is procedure Set_Edges (F : Natural; Item : Edge_Type) is N : constant Natural := Edges + (F * Edge_Size); - begin IT.Table (N) := Item.X; IT.Table (N + 1) := Item.Y; @@ -2266,44 +2331,36 @@ package body GNAT.Perfect_Hash_Generators is -- Set_Graph -- --------------- - procedure Set_Graph (F : Natural; Item : Integer) is - N : constant Natural := G + (F * Graph_Item_Size); - + procedure Set_Graph (N : Natural; Item : Integer) is begin - IT.Table (N) := Item; + IT.Table (G + N) := Item; end Set_Graph; ------------- -- Set_Key -- ------------- - procedure Set_Key (F : Key_Id; Item : Key_Type) is - N : constant Natural := Keys + F * Key_Size; - + procedure Set_Key (N : Key_Id; Item : Key_Type) is begin - IT.Table (N) := Item.Edge; + IT.Table (Keys + N) := Item.Edge; end Set_Key; - ------------------ - -- Set_Rand_Tab -- - ------------------ - - procedure Set_Rand_Tab (T : Integer; X, Y : Natural; Item : Natural) is - N : constant Natural := - T + ((Y * Rand_Tab_Len_1) + X) * Rand_Tab_Item_Size; + --------------- + -- Set_Table -- + --------------- + procedure Set_Table (T : Integer; X, Y : Natural; Item : Natural) is + N : constant Natural := T + ((Y * T1_Len) + X); begin IT.Table (N) := Item; - end Set_Rand_Tab; + end Set_Table; ------------------- -- Set_Used_Char -- ------------------- procedure Set_Used_Char (C : Character; Item : Natural) is - N : constant Natural := - Used_Char_Set + Character'Pos (C) * Used_Char_Size; - + N : constant Natural := Used_Char_Set + Character'Pos (C); begin IT.Table (N) := Item; end Set_Used_Char; @@ -2314,7 +2371,6 @@ package body GNAT.Perfect_Hash_Generators is procedure Set_Vertices (F : Natural; Item : Vertex_Type) is N : constant Natural := Vertices + (F * Vertex_Size); - begin IT.Table (N) := Item.First; IT.Table (N + 1) := Item.Last; @@ -2327,24 +2383,23 @@ package body GNAT.Perfect_Hash_Generators is function Sum (Word : Word_Type; Table : Table_Id; - Opt : Optimization) - return Natural + Opt : Optimization) return Natural is S : Natural := 0; R : Natural; begin if Opt = CPU_Time then - for J in 0 .. Rand_Tab_Len_1 - 1 loop + for J in 0 .. T1_Len - 1 loop exit when Word (J + 1) = ASCII.NUL; - R := Get_Rand_Tab (Table, J, Get_Used_Char (Word (J + 1))); + R := Get_Table (Table, J, Get_Used_Char (Word (J + 1))); S := (S + R) mod NV; end loop; else - for J in 0 .. Rand_Tab_Len_1 - 1 loop + for J in 0 .. T1_Len - 1 loop exit when Word (J + 1) = ASCII.NUL; - R := Get_Rand_Tab (Table, J, 0); + R := Get_Table (Table, J, 0); S := (S + R * Character'Pos (Word (J + 1))) mod NV; end loop; end if; @@ -2373,9 +2428,8 @@ package body GNAT.Perfect_Hash_Generators is function Value (Name : Table_Name; - J : Natural; - K : Natural := 0) - return Natural + J : Natural; + K : Natural := 0) return Natural is begin case Name is @@ -2386,10 +2440,10 @@ package body GNAT.Perfect_Hash_Generators is return Get_Used_Char (Character'Val (J)); when Function_Table_1 => - return Get_Rand_Tab (T1, J, K); + return Get_Table (T1, J, K); when Function_Table_2 => - return Get_Rand_Tab (T2, J, K); + return Get_Table (T2, J, K); when Graph_Table => return Get_Graph (J); diff --git a/gcc/ada/g-pehage.ads b/gcc/ada/g-pehage.ads index e9f3636..5cff8c5 100644 --- a/gcc/ada/g-pehage.ads +++ b/gcc/ada/g-pehage.ads @@ -6,7 +6,7 @@ -- -- -- S p e c -- -- -- --- Copyright (C) 2002-2004 Ada Core Technologies, Inc. -- +-- Copyright (C) 2002-2005 Ada Core Technologies, Inc. -- -- -- -- GNAT is free software; you can redistribute it and/or modify it under -- -- terms of the GNU General Public License as published by the Free Soft- -- @@ -31,122 +31,133 @@ -- -- ------------------------------------------------------------------------------ --- This package provides a generator of static minimal perfect hash --- functions. To understand what a perfect hash function is, we --- define several notions. These definitions are inspired from the --- following paper: - --- Zbigniew J. Czech, George Havas, and Bohdan S. Majewski ``An --- Optimal Algorithm for Generating Minimal Perfect Hash Functions'', --- Information Processing Letters, 43(1992) pp.257-264, Oct.1992 - --- Let W be a set of m words. A hash function h is a function that --- maps the set of words W into some given interval of integers --- [0, k-1], where k is an integer, usually k >= m. h (w) where w --- is a word computes an address or an integer from I for the --- storage or the retrieval of that item. The storage area used to --- store items is known as a hash table. Words for which the same --- address is computed are called synonyms. Due to the existence --- of synonyms a situation called collision may arise in which two --- items w1 and w2 have the same address. Several schemes for --- resolving known. A perfect hash function is an injection from --- the word set W to the integer interval I with k >= m. If k = m, --- then h is a minimal perfect hash function. A hash function is --- order preserving if it puts entries into the hash table in a --- prespecified order. +-- This package provides a generator of static minimal perfect hash functions. +-- To understand what a perfect hash function is, we define several notions. +-- These definitions are inspired from the following paper: + +-- Zbigniew J. Czech, George Havas, and Bohdan S. Majewski ``An Optimal +-- Algorithm for Generating Minimal Perfect Hash Functions'', Information +-- Processing Letters, 43(1992) pp.257-264, Oct.1992 + +-- Let W be a set of m words. A hash function h is a function that maps the +-- set of words W into some given interval of integers [0, k-1], where k is an +-- integer, usually k >= m. h (w) where is a word computes an address or an +-- integer from I for the storage or the retrieval of that item. The storage +-- area used to store items is known as a hash table. Words for which the same +-- address is computed are called synonyms. Due to the existence of synonyms a +-- situation called collision may arise in which two items w1 and w2 have the +-- same address. Several schemes for resolving known. A perfect hash function +-- is an injection from the word set W to the integer interval I with k >= m. +-- If k = m, then h is a minimal perfect hash function. A hash function is +-- order preserving if it puts entries into the hash table in prespecified +-- order. -- A minimal perfect hash function is defined by two properties: --- Since no collisions occur each item can be retrieved from the --- table in *one* probe. This represents the "perfect" property. +-- Since no collisions occur each item can be retrieved from the table in +-- *one* probe. This represents the "perfect" property. --- The hash table size corresponds to the exact size of W and --- *no larger*. This represents the "minimal" property. +-- The hash table size corresponds to the exact size of W and *no larger*. +-- This represents the "minimal" property. --- The functions generated by this package require the key set to --- be known in advance (they are "static" hash functions). --- The hash functions are also order preservering. If w2 is inserted --- after w1 in the generator, then f (w1) < f (w2). These hashing --- functions are convenient for use with realtime applications. +-- The functions generated by this package require the key set to be known in +-- advance (they are "static" hash functions). The hash functions are also +-- order preservering. If w2 is inserted after w1 in the generator, then (w1) +-- < f (w2). These hashing functions are convenient for use with realtime +-- applications. package GNAT.Perfect_Hash_Generators is Default_K_To_V : constant Float := 2.05; - -- Default ratio for the algorithm. When K is the number of keys, - -- V = (K_To_V) * K is the size of the main table of the hash function. + -- Default ratio for the algorithm. When K is the number of keys, V = + -- (K_To_V) * K is the size of the main table of the hash function. To + -- converge, the algorithm requires K_To_V to be stricly greater than 2.0. Default_Pkg_Name : constant String := "Perfect_Hash"; - -- Default package name in which the hash function is defined. + -- Default package name in which the hash function is defined Default_Position : constant String := ""; - -- The generator allows selection of the character positions used - -- in the hash function. By default, all positions are selected. + -- The generator allows selection of the character positions used in the + -- hash function. By default, all positions are selected. + + Default_Tries : constant Positive := 20; + -- This algorithm may not succeed to find a possible mapping on the first + -- try and may have to iterate a number of times. This constant bounds the + -- number of tries. type Optimization is (Memory_Space, CPU_Time); Default_Optimization : constant Optimization := CPU_Time; - -- Optimize either the memory space or the execution time. + -- Optimize either the memory space or the execution time Verbose : Boolean := False; - -- Comment required ??? + -- Output the status of the algorithm. For instance, the tables, the random + -- graph (edges, vertices) and selected char positions are output between + -- two iterations. procedure Initialize (Seed : Natural; K_To_V : Float := Default_K_To_V; - Optim : Optimization := CPU_Time); - -- Initialize the generator and its internal structures. Set the - -- ratio of vertices over keys in the random graphs. This value - -- has to be greater than 2.0 in order for the algorithm to succeed. + Optim : Optimization := CPU_Time; + Tries : Positive := Default_Tries); + -- Initialize the generator and its internal structures. Set the ratio of + -- vertices over keys in the random graphs. This value has to be greater + -- than 2.0 in order for the algorithm to succeed. The key set is not + -- modified (in particular when it is already set). For instance, it is + -- possible to run several times the generator with different settings on + -- the same key set. procedure Finalize; - -- Deallocate the internal structures. + -- Deallocate the internal structures and the key table procedure Insert (Value : String); - -- Insert a new key in the table. + -- Insert a new key in the table + + Too_Many_Tries : exception; + -- Raised after Tries unsuccessfull runs procedure Compute (Position : String := Default_Position); - -- Compute the hash function. Position allows to define a - -- selection of character positions used in the keywords hash - -- function. Positions can be separated by commas and range like - -- x-y may be used. Character '$' represents the final character - -- of a key. With an empty position, the generator automatically - -- produces positions to reduce the memory usage. + -- Compute the hash function. Position allows to define selection of + -- character positions used in the keywords hash function. Positions can be + -- separated by commas and range like x-y may be used. Character '$' + -- represents the final character of a key. With an empty position, the + -- generator automatically produces positions to reduce the memory usage. + -- Raise Too_Many_Tries in case that the algorithm does not succeed in less + -- than Tries attempts (see Initialize). procedure Produce (Pkg_Name : String := Default_Pkg_Name); - -- Generate the hash function package Pkg_Name. This package - -- includes the minimal perfect Hash function. + -- Generate the hash function package Pkg_Name. This package includes the + -- minimal perfect Hash function. - -- The routines and structures defined below allow producing the - -- hash function using a different way from the procedure above. - -- The procedure Define returns the lengths of an internal table - -- and its item type size. The function Value returns the value of - -- each item in the table. + -- The routines and structures defined below allow producing the hash + -- function using a different way from the procedure above. The procedure + -- Define returns the lengths of an internal table and its item type size. + -- The function Value returns the value of each item in the table. -- The hash function has the following form: -- h (w) = (g (f1 (w)) + g (f2 (w))) mod m - -- G is a function based on a graph table [0,n-1] -> [0,m-1]. m is - -- the number of keys. n is an internally computed value and it - -- can be obtained as the length of vector G. + -- G is a function based on a graph table [0,n-1] -> [0,m-1]. m is the + -- number of keys. n is an internally computed value and it can be obtained + -- as the length of vector G. - -- F1 and F2 are two functions based on two function tables T1 and - -- T2. Their definition depends on the chosen optimization mode. + -- F1 and F2 are two functions based on two function tables T1 and T2. + -- Their definition depends on the chosen optimization mode. - -- Only some character positions are used in the keys because they - -- are significant. They are listed in a character position table - -- (P in the pseudo-code below). For instance, in {"jan", "feb", - -- "mar", "apr", "jun", "jul", "aug", "sep", "oct", "nov", "dec"}, - -- only positions 2 and 3 are significant (the first character can - -- be ignored). In this example, P = {2, 3} + -- Only some character positions are used in the keys because they are + -- significant. They are listed in a character position table (P in the + -- pseudo-code below). For instance, in {"jan", "feb", "mar", "apr", "jun", + -- "jul", "aug", "sep", "oct", "nov", "dec"}, only positions 2 and 3 are + -- significant (the first character can be ignored). In this example, P = + -- {2, 3} -- When Optimization is CPU_Time, the first dimension of T1 and T2 - -- corresponds to the character position in the key and the second - -- to the character set. As all the character set is not used, we - -- define a used character table which associates a distinct index - -- to each used character (unused characters are mapped to - -- zero). In this case, the second dimension of T1 and T2 is - -- reduced to the used character set (C in the pseudo-code - -- below). Therefore, the hash function has the following: + -- corresponds to the character position in the key and the second to the + -- character set. As all the character set is not used, we define a used + -- character table which associates a distinct index to each used character + -- (unused characters are mapped to zero). In this case, the second + -- dimension of T1 and T2 is reduced to the used character set (C in the + -- pseudo-code below). Therefore, the hash function has the following: -- function Hash (S : String) return Natural is -- F : constant Natural := S'First - 1; @@ -165,11 +176,11 @@ package GNAT.Perfect_Hash_Generators is -- return (Natural (G (F1)) + Natural (G (F2))) mod ; -- end Hash; - -- When Optimization is Memory_Space, the first dimension of T1 - -- and T2 corresponds to the character position in the key and the - -- second dimension is ignored. T1 and T2 are no longer matrices - -- but vectors. Therefore, the used character table is not - -- available. The hash function has the following form: + -- When Optimization is Memory_Space, the first dimension of T1 and T2 + -- corresponds to the character position in the key and the second + -- dimension is ignored. T1 and T2 are no longer matrices but vectors. + -- Therefore, the used character table is not available. The hash function + -- has the following form: -- function Hash (S : String) return Natural is -- F : constant Natural := S'First - 1; @@ -200,17 +211,16 @@ package GNAT.Perfect_Hash_Generators is Item_Size : out Natural; Length_1 : out Natural; Length_2 : out Natural); - -- Return the definition of the table Name. This includes the - -- length of dimensions 1 and 2 and the size of an unsigned - -- integer item. When Length_2 is zero, the table has only one - -- dimension. All the ranges start from zero. + -- Return the definition of the table Name. This includes the length of + -- dimensions 1 and 2 and the size of an unsigned integer item. When + -- Length_2 is zero, the table has only one dimension. All the ranges start + -- from zero. function Value (Name : Table_Name; J : Natural; - K : Natural := 0) - return Natural; - -- Return the value of the component (I, J) of the table - -- Name. When the table has only one dimension, J is ignored. + K : Natural := 0) return Natural; + -- Return the value of the component (I, J) of the table Name. When the + -- table has only one dimension, J is ignored. end GNAT.Perfect_Hash_Generators; -- 2.7.4