Messages from 67425

Article: 67425
Subject: Re: licence for Xilinx 2.1i
From: Larry Doolittle <ldoolitt@recycle.lbl.gov>
Date: Thu, 11 Mar 2004 17:54:55 +0000 (UTC)

In article <4050A1A4.5040206@no_xilinx_spam.com>, Brian Philofsky wrote:
>> Can the webpack tools be used to synthesize the design to an XNF or EDIF
>> file to use in the ISE 4.1 tool?  
> 
> No, not for the XC4000 families.  That is why there is no synthesis tool 
> in the Classic version of the software.  XST can only target 
> Virtex/Spartan-II and later architectures and was never made to work 
> with the older devices.

Icarus Verilog (GPL, http://icarus.com/eda/verilog/) can synthesize
for an XC4000 target via XNF.  Its synthesis subset is relatively limited,
but it can be made to work.  Tested (a long time ago) with ISE 2.1.

      - Larry

Article: 67426
Subject: Re: 300MHz spartan3 cpu update , and Webpack6.2 shocker
From: rickman <spamgoeshere4@yahoo.com>
Date: Thu, 11 Mar 2004 12:56:23 -0500

john jakson wrote:
> 
> jon@beniston.com (Jon Beniston) wrote in message news:<e87b9ce8.0403110225.313772a3@posting.google.com>...
> > > I also upgraded to Webpack 6.2 and got quite a shock. For the sp3, the
> > > fmax has shot up to 430MHz overall and the 2 larger blocks besides
> > > blockram are in the 550MHz ballpark.
> >
> > If true, then Wow! Your FPGA based CPU is faster than 0.13 ASIC CPUs.
> >
> 
> Only because of HT and 3 levels of LUT logic. HT means the pipelines
> are mostly independent but cyclic like the spokes of a wheel. Just
> like DSP (which is what I have been doing a lot of over the years since
> my Inmos days).
> 
> >  The timing report is essentially
> > > same as before, no special warnings but some improvement in compile
> > > time I was looking for.
> >
> > Is that post-synthesis or post-layout?
> >
> 
> Post synth.

OH!!!  My experience is that the synth tools do a lousy job of
estimating total delay.  You can expect your routed delays to be twice
that or more.  If you are seeing 400 MHz post synth, you will be lucky
to get 200 MHz after routing.  If you don't floorplan well, you may see
<100 MHz.  
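
As a rough sketch of the arithmetic behind those figures, assuming the
routed critical-path delay is simply the post-synthesis estimate scaled by
a factor (the function name and the factors below are illustrative, not
output from any tool):

```python
def routed_fmax_mhz(synth_fmax_mhz, delay_factor):
    """Scale a post-synthesis fmax estimate by a path-delay factor.

    delay_factor is an assumed multiplier on the critical-path delay:
    2.0 means routing doubles the delay the synthesizer estimated.
    """
    period_ns = 1000.0 / synth_fmax_mhz          # MHz -> ns
    routed_period_ns = period_ns * delay_factor  # add routing delay
    return 1000.0 / routed_period_ns             # ns -> MHz

print(routed_fmax_mhz(400, 2.0))   # delay doubled by routing
print(routed_fmax_mhz(400, 4.0))   # poor floorplan, delay quadrupled
```

Doubling the delay halves the clock rate, which is why a 400 MHz
post-synthesis number can turn into 200 MHz or less after place and route.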

-- 

Rick "rickman" Collins

rick.collins@XYarius.com
Ignore the reply address. To email me use the above address with the XY
removed.

Arius - A Signal Processing Solutions Company
Specializing in DSP and FPGA design      URL http://www.arius.com
4 King Ave                               301-682-7772 Voice
Frederick, MD 21701-3110                 301-682-7666 FAX

Article: 67427
Subject: Re: Dual-stack (Forth) processors
From: rickman <spamgoeshere4@yahoo.com>
Date: Thu, 11 Mar 2004 13:02:05 -0500

Brad Eckert wrote:
> 
> rickman <spamgoeshere4@yahoo.com> wrote in message news:<404FDFE5.4961E719@yahoo.com>...
> > I am aware
> > that you market an HDL version of an older Forth processor, but I expect
> > that is not well optimized for FPGA implementation.
> >
> From what I've seen on the MPE web site, the RTX2000 core runs
> reasonably fast in an FPGA. About the same speed as my processor. The
> stacks use block RAM, so I would expect it to fit reasonably well in
> an FPGA. Maybe Stephen can tell us how many Spartan II slices it uses.
> 
> The RTX2000 is an old design, but newer isn't necessarily better. It
> was designed back when hardware was expensive and it turned out very
> well.

I have looked at a lot of CPU implementations for FPGAs over the last
week or two and my observation is that an FPGA version of an ASIC CPU
tends to be significantly larger than a CPU designed just for an FPGA. 
I am not so concerned about speed, more the size.  The ARM I am using is
only 60 MHz and I would expect that out of any decent CPU core in an
FPGA.  But I would like to see it use under 600 LCs/LEs plus block RAM. 
I seem to recall seeing some that were as small as 400 LEs.  

But the big issue is having a decent development environment for
software.  Any FPGA designer can make a CPU... even me :)

-- 

Rick "rickman" Collins

rick.collins@XYarius.com
Ignore the reply address. To email me use the above address with the XY
removed.

Arius - A Signal Processing Solutions Company
Specializing in DSP and FPGA design      URL http://www.arius.com
4 King Ave                               301-682-7772 Voice
Frederick, MD 21701-3110                 301-682-7666 FAX

Article: 67428
Subject: Re: difference btw H/W & S/W implementations !!
From: hrubin@odds.stat.purdue.edu (Herman Rubin)
Date: 11 Mar 2004 14:34:54 -0500

In article <zmU3c.38500$6y1.1282676@news20.bellglobal.com>,
Invisible One <Invisible_1@sympatico.ca> wrote:
>*sigh* - is speed the thing that everyone is concerned with?

>Hands down, hardware is faster than software - most non-technical folks know
>this.  End of story.  Now if you are looking at complex algorithm
>implementations, algorithm COMPLEXITY is of paramount concern.  Time
>complexity is independent of platform: sw/hw.

What is complexity?  This is not a vacuous question; the 
computational complexity of generating random variables 
with "most" distributions from uniform random input by
the usual methods goes up rapidly with the length, but
if one is allowed to use bit methods, it is random but finite,
usually with finite expectation.  However, it can be so
slow as not even to be considered.  This is the case even
if no roundoff is allowed, other than restricting the 
number of bits output.
-- 
This address is for information only.  I do not claim that these views
are those of the Statistics Department or of Purdue University.
Herman Rubin, Department of Statistics, Purdue University
hrubin@stat.purdue.edu         Phone: (765)494-6054   FAX: (765)494-0558

Article: 67429
Subject: Re: Answering Machine RAM
From: wv9557@yahoo.com (Will)
Date: 11 Mar 2004 13:04:18 -0800

"Kevin Neilson" <kevin_neilson@removethiscomcast.net> wrote in message news:<BJQ3c.1317$C51.24342@attbi_s52>...
> I know; this is off-topic:  I can't figure out why, when there is a brief
> power outage, my answering machine loses time information but retains voice
> messages.  I guess the messages must be stored in FLASH.  Is it just too
> much of a pain to write the time into FLASH as well?  If it can't be written
> into FLASH, couldn't it use a small cap to back up the volatile RAM?  Maybe
> I shouldn't expect too much for $25.
> -Kevin

I suppose they could have resorted to battery backup, but didn't bother.
Will

Article: 67430
Subject: where to start for going high bandwidth [was: next learning platform]
From: myren <thefowle@wam.umd.edu>
Date: Thu, 11 Mar 2004 17:31:29 -0500

let's preface this with what i'm looking for: a student budget route for 
going into high bandwidth signal processing on a portable (presumably thus 
microcontroller controlled) platform.  there's a << background >> at the 
bottom with more on where i'm coming from: learned micros, want to move 
on.  i want to move from embedded systems as a chip running a bunch of 
simple IC's and passives to a bunch of active chips working with each other.

there's a bunch of codecs already out there to do 8 channel 192 kHz 24 
bit audio.  that's ~4.6 Mbytes a second of raw data, which eventually needs 
to end up on a computer (ethernet, usb2, firewire, whatever).  it seems clear 
that conventional bit banging on a micro is just not an option.  that's 
fine, i want to learn more than that anyways.  some sort of DMA system 
is the obvious path.  this goes where i want to go: towards multi chip 
systems which are using DMA systems.
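
For reference, the raw rate for that codec setup works out as below. This
is only a quick sketch: it assumes each 24-bit sample is packed into 3
bytes with no framing overhead, and the function name is made up for
illustration.

```python
def raw_audio_rate_bytes_per_s(channels, sample_rate_hz, bits_per_sample):
    """Raw payload rate of an uncompressed multichannel audio stream."""
    bits_per_second = channels * sample_rate_hz * bits_per_sample
    return bits_per_second // 8

rate = raw_audio_rate_bytes_per_s(8, 192_000, 24)
print(rate, "bytes/s")            # 4608000 bytes/s
print(rate * 8 / 1e6, "Mbit/s")   # payload only, before any bus overhead
```

That is about 37 Mbit/s of payload, so the link to the computer has to
sustain that continuously, not just in bursts.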

Q1: perhaps i should start with fpga's?  opencores.org has a million 
things you can put on a wishbone bus.  seems awesome, sure it'll be hard 
as hell, but fun.  haven't found many gate counts, not sure how high end 
i need to go in the fpga world.

Q2: do you normally replace microcontrollers with fpga's?  or can you 
still have your micro running control and your fpga doing the more high 
bandwidth work: shuffling data around / doing basic fixed processing?  it 
seems like having your entire core on an fpga is kind of excessive, not 
to mention tricky to get it started up right.  and there seem to be 
strong advantages to having a micro as an all star interrupt vector 
processor / event handler sort of unit.

Q3: what micro should i use for this?  i'm looking for a decent college 
student platform to begin learning cross device integration.  i need a 
new micro: there's no way i'm doing this all in assembly on my 
Scenix/Ubicom's (or basic for that matter ;).  8051's seem pretty 
popular.  ARM is nice for its high speeds.  Will most in circuit 
debuggers / programmers for a given platform work with a variety of 
chips, or are you vendor locked?  Going with a system where I can forgo 
the demo/reference board and just get a programmer/debugger would be a 
good cost saving.

<< background >>
i thought about this a lot last night and realized i should be more 
clear in what i'm looking for.  i finished signals and systems last 
semester and it blew my mind.  suddenly i have this burning passion to 
start doing more high bandwidth processing.  i'm starting to get more 
interested in amateur radio, audio processing, general audio at large.

on a more philosophical level, my interest in computer engineering has 
stemmed much from my interest in legos: the ability to piece together 
blocks to form something with shape and purpose.  i feel like, having 
"mastered" the scenix/ubicom 75 MIPS pic clone, i've more or less passed 
the stand alone micro phase, and want to begin working on adding more 
building blocks to my repertoire.

originally i thought going linux would be nice, but uCLinux is a joke 
(no multithreading, can't compile most linux apps... what's the point).


Thanks
Myren

Article: 67431
Subject: Re: Does iseWebPack 6.2w has FPGA-Editor inside?
From: Steve Sharp <sharp@xilinx.com>
Date: Thu, 11 Mar 2004 15:59:11 -0800

Kelvin,

WebPACK v6.2i does not have FPGA Editor. That functionality is in the
ISE Foundation or ISE Alliance versions only.

Regards,
Steve Sharp
Xilinx Product Solutions Marketing

Kelvin wrote:

> Does Webpack 6.2w have any FPGA Editor functionality inside?

Article: 67432
Subject: Re: 300MHz spartan3 cpu update , and Webpack6.2 shocker
From: johnjakson@yahoo.com (john jakson)
Date: 11 Mar 2004 16:00:13 -0800

johnjakson@yahoo.com (john jakson) wrote in message news:<adb3971c.0403101708.57d378fb@posting.google.com>...
> A while back I mentioned I'd post an update on progress for this R3
> hyperthreaded cpu project.
> 

post note.
In light of the 6.2 result which I don't believe either, I went back
and did a synth for a registered adder ins & outs of var widths, 4b at
a time for some of the families.


// 6.2 -------------------------  |  6.1  | 6.13  | est
// add sp2  v2  v2 v2p  sp3  sp3  |  sp3  |  sp3  |  v2
// wid  -6  -5  -6  -7   -4   -5  |   -4  |   -5  | -5.5
//                                |       |       |
//  4  220 403 454 541  1371 1371 |  303  |  433  | 429
//  8  212 303 359 404            |  238  |  330  | 331
// 12  205 287 339 382  4166 4166 |  219  |>>311<<| 313
// 16  198 273 320 361            |  203  |  294  | 296
// 20  191 260 304 343            |  189  |  279  | 282
// 24  185 248 289 327            |  177  |  266  | 268
// 28  180 237 276 312            |  166  |  253  | 256  
// 32  175 227 264 298  4166 4166 |  157  |  242  | 245

On 6.2 the sp3 result is nonsense so I didn't complete it.

I reinstalled 6.1 to fill in the gap, then added the service pack &
speed file. I added an est v2 -5.5 col for comparison. The results
look ok to me, I will leave 6.2 alone till it's fixed for sp3.
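
The est column appears to be the midpoint of the 6.2 v2 -5 and v2 -6
columns. That is an inference from the numbers, not something stated in
the post. A quick check of that reading, and of how closely the est column
tracks the 6.13 sp3 -5 column:

```python
# fmax figures (MHz) copied from the table above, adder widths 4..32
widths     = [4, 8, 12, 16, 20, 24, 28, 32]
v2_m5      = [403, 303, 287, 273, 260, 248, 237, 227]   # 6.2, v2 -5
v2_m6      = [454, 359, 339, 320, 304, 289, 276, 264]   # 6.2, v2 -6
sp3_m5_613 = [433, 330, 311, 294, 279, 266, 253, 242]   # 6.13, sp3 -5
est_v2     = [429, 331, 313, 296, 282, 268, 256, 245]   # est, v2 -5.5

# Assumed meaning of "est": the average of the two v2 speed grades
midpoints = [(a + b) / 2 for a, b in zip(v2_m5, v2_m6)]

# Largest gap between the posted est column and the computed midpoint
max_mid_err = max(abs(m - e) for m, e in zip(midpoints, est_v2))

# Largest relative gap between sp3 -5 (6.13) and the est column
max_rel_gap = max(abs(s - e) / e for s, e in zip(sp3_m5_613, est_v2))

for w, m, e, s in zip(widths, midpoints, est_v2, sp3_m5_613):
    print(f"{w:2d} bits: midpoint {m:6.1f}  est {e}  sp3 -5 {s}")
```

On these figures the est column is within half a MHz of the midpoint at
every width, and the 6.13 sp3 -5 results stay within about 1.5% of it.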

The 311 is highlighted because that's my crit path.

You can't get much simpler than this. I suspect some sort of overflow
error or missing something. I even clean installed 6.2 twice but no
difference.



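`define wid 4   // adder width under test, stepped 4..32 for the table

// Registered adder: inputs and output are all registered, so the
// reported fmax covers only the register-to-register add path.
// The module keeps the name Add32 at every width.
module Add32(
  input ck, input [`wid-1:0] a,b, output [`wid-1:0] o);

  reg [`wid-1:0] ra,rb,rz;   // input registers ra, rb; output register rz
  assign o = rz;

  always @(posedge ck)
  begin
    rz <= ra+rb; ra <= a; rb <= b;   // sum is `wid bits, carry-out dropped
  end
endmodule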

regards

johnjakson_usa_com

Article: 67433
Subject: Xilinx RAMB16_Sm_Sn timing diagram
From: ianwyb@yahoo.com (ian)
Date: 11 Mar 2004 16:22:34 -0800

Hi, Folks, 

Where can I find a Xilinx RAMB16_Sm_Sn DPRAM's timing diagram?

I searched the web and got nothing.

Thanks in advance

Ian

Article: 67434
Subject: Re: what exactly means fanout ?
From: Ray Andraka <ray@andraka.com>
Date: Thu, 11 Mar 2004 19:41:38 -0500

Fan-in is the number of signals feeding into a function for a
signal.  For example, a 2 input nand has a fan-in of 2.  Fan-out is
the number of loads driven by a signal.  In FPGAs fanout is not so
much a concern for loading as it is for delay caused by having to
route the signal to so many destinations (because FPGA signals
generally get buffered in the routing).  The longer the net, the
more the delay, and when you have a high fanout you are more or less
guaranteed to have a long net.
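
To make the two definitions concrete, here is a small sketch that counts
fan-in and fan-out over a toy netlist. The netlist, gate names and net
names are hypothetical, invented only for this illustration.

```python
from collections import Counter

# Hypothetical netlist: each gate lists its output net and its input nets.
netlist = {
    "u1": {"out": "n1", "ins": ["a", "b"]},        # a 2-input NAND
    "u2": {"out": "n2", "ins": ["n1", "c"]},
    "u3": {"out": "n3", "ins": ["n1", "d", "e"]},
    "u4": {"out": "n4", "ins": ["n1", "n2"]},
}

def fan_in(netlist):
    """Fan-in per gate: how many signals feed that function."""
    return {gate: len(pins["ins"]) for gate, pins in netlist.items()}

def fan_out(netlist):
    """Fan-out per net: how many gate inputs (loads) the net drives."""
    loads = Counter()
    for pins in netlist.values():
        loads.update(pins["ins"])
    return dict(loads)

print(fan_in(netlist))
print(fan_out(netlist))
```

Here net n1 drives three loads, so it is the net most likely to end up
long once the three destinations are placed.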

chris wrote:

> Fanout/fanin are terms that appear everywhere in the FPGA
> literature but I cannot really find out what they mean exactly.
> Can someone enlighten me on this subject?
> Thanks.
> Christophe.

--
--Ray Andraka, P.E.
President, the Andraka Consulting Group, Inc.
401/884-7930     Fax 401/884-7950
email ray@andraka.com
http://www.andraka.com

 "They that give up essential liberty to obtain a little
  temporary safety deserve neither liberty nor safety."
                                          -Benjamin Franklin, 1759

Article: 67435
Subject: Re: 300MHz spartan3 cpu update , and Webpack6.2 shocker
From: johnjakson@yahoo.com (john jakson)
Date: 11 Mar 2004 16:43:37 -0800

Marius Vollmer <marius.vollmer@uni-dortmund.de> wrote in message news:<ljsmgfh04i.fsf@troy.dt.e-technik.uni-dortmund.de>...
> johnjakson@yahoo.com (john jakson) writes:
> 
> > A couple of slower Transputers should easily be able to beat 1
> > faster x86 at any task.
> 
> Hmmm, not at _any_ task, I'd say.  Some tasks might not be
> parallelizable enough.  And even if they are, you still need people to
> write these parallel programs, which is very much harder than writing
> sequential programs.
>

Overstating things a bit, I agree. The reason I am prone to beat up on
x86 is because I routinely see poor performance from C code that beats
up the cache, as well as the miserable W2k performance in general
in spite of my xp2400. That's more to do with MS though, as BeOS shows
the x86 in a good light even for slow cpus. I have to wonder though
what MS & x86 can do with trillions of cpu cycles that don't do any
useful work at W2K boot up when the same no of Vax cycles would have
served a group of 50 people for a few days.

If you go back to when the Transputer 1st came out, it did indeed whop
the 386 of the time, 20MHz or so on 20MHz. But it never had much of a
successor. I know the pentium 100 is about 30x slower than the P4, so
I would guess the 386 was another 10x slower than p100. So Intel has
given us 300x perf increase in 25yrs (approx). Being familiar with the
Transputer, I can much more accurately predict that a 300MHz R3 will
whop the old 25MHz part by at least 50x too (a far simpler comparison
for sure, it's RISC v stack architecture). That leaves R3 out by a factor
of 6 or so, which is in line with the freq difference too. But N
transputers is something I can use just like a bigger FPGA LUT count;
a bucket full of AMD64 serves no purpose after the 1st without great
expense in system.

The real deal comes with 2nd level memory/cache. I am designing for
the RLDRAM which can cycle in 20ns & 8 ways; that is about 6 cpu
cycles, well within the commutation time for a process to fall from
inner 4 to outer 12 and back in again. That compares with x86 being
hundreds of times faster than its usual DRAM, and that is many times
slower than RLDRAM but quite a bit cheaper.
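
The back-of-envelope numbers in the last two paragraphs can be checked in
a few lines. The figures are the ones quoted in the post; the variable and
function names are only for illustration.

```python
def cycles_for(latency_ns, clock_mhz):
    """Number of clock cycles that fit in latency_ns at clock_mhz."""
    return latency_ns * clock_mhz / 1000.0

# 20 ns RLDRAM cycle seen from a 300 MHz core
rldram_cycles = cycles_for(20, 300)

# Guessed speedups: P4 ~30x a P100, P100 ~10x a 386
p4_over_386 = 30 * 10
# Predicted R3 speedup over the old 25MHz Transputer
r3_over_transputer = 50
# How far that leaves R3 behind the x86 line
gap = p4_over_386 / r3_over_transputer

print(rldram_cycles, p4_over_386, gap)
```

Both the RLDRAM cycle count and the remaining gap come out at 6, matching
the "about 6 cpu cycles" and "factor of 6 or so" in the text.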

> I don't see anyone rewriting their HDL synthesis tools, for example,
> to take advantage of massive fine-grain parallelism to speed them up.
> Heck, they don't even take advantage of the widely existing dual-CPU
> SMP, right?
>

True, this exact same discussion is endless on comp.arch, but Intel may
have seen the end of ever faster seq computing on the wall. Certainly
BlueGene and others will all go the same way, massively parallel
slower cooler cheaper cpus.

The issue isn't whether current single threaded apps will have to be
rewritten, it's a question of new markets for new apps not possible
before at some mips/$ level. The Transputer did it before, I think it's
doable again.

The further issue after the Transputer core is done is that it will
include the scheduler needed to support event driven HDL simulation
for a Verilog subset. That in turn means that code can be written in
C-Occam or lite Verilog. The compiler for that is at the midway stage
but needs the cpu to have been completed before the target code can be
emitted. If you can run lite HDL on a cpu, you can also synth it to HW
for more speed, just a way of looking at HW & SW interchangeably for
some domains.

> But, your project sounds mighty cool.  By all means, do it, and
> release it with the GPL! :-)

I will keep going, nothing much better to do in NE, but GPL I am not
sure of.

johnjakson_usa_com

Article: 67436
Subject: Re: 300MHz spartan3 cpu update , and Webpack6.2 shocker
From: johnjakson@yahoo.com (john jakson)
Date: 11 Mar 2004 16:50:44 -0800

> I will say your goals are optimistic at least.  If you are doing this as
> a hobby, then fine.  But if you are serious about marketing a CPU or
> core, then you have a long road ahead of you.  But then maybe you
> realize that and are up to the task.  

I am well aware of that, I have had 3yrs to work on the background
stuff and compiler, this latest cpu part is just moving along more
quickly than expected.

> 
> One thing I don't quite get about the idea of everyone rolling their own
> CPU is the effort required.  Clearly the automotive companies have a
> high enough volume to justify designing their own CPUs and SOCs.  But
> mostly they just go to a chip company and ask, "what can you build for
> us?"  This is because there is a lot of work involved in doing a new
> architecture.  So why would other companies want to make such a large
> investment in a custom CPU design even if they don't have to build an
> ASIC to get it built? 
> 

For a classic roll yer own single threaded cpu, there won't be much
incentive to switch anyone, I agree. It's the Transputer part that is
compelling to those who know it; it means nothing to those that never
used it, esp in the US. I can live with that.


> I don't want to doubt that you are seeing the results you claim.  But
> Xilinx has publicly said that the Sp3 is not as fast as the VII/VIIpro. 
> So unless you can get the same or better results targeting the Virtex
> family, I suspect your results are anomalous.  
> 

See other post, the 6.2 release for sp3 is odd, you can duplicate it,
I went back to 6.13. 300MHz is fine for now.


> But keep us informed, this is very interesting.  

Article: 67437
Subject: Re: CORDIC vs. LUT
From: Ray Andraka <ray@andraka.com>
Date: Thu, 11 Mar 2004 19:57:32 -0500

Depends on the precision you need.  CORDIC can get you to any precision
desired.  LUTs grow exponentially with the phase resolution, so for a small
number of phase angles, the LUT works well.  With the RAMB16's in V2, you have
14 bits of address in a x1 configuration, which gives you 16 bits of phase
resolution.  To take advantage of that, you'd need it 16 or more bits wide
which means 16 BRAMs.  You could use dual port BRAMs to get sin and cos
simultaneously, then you need multipliers to do the multiply.  This is roughly
equivalent to a 14 iteration CORDIC.  You also get a little bit more noise with
a LUT plus multiply of a given width because of the double quantization (phase
and then the multiply).  With CORDIC you get an exact rotation, but the
rotation angles are limited by the number of iterations.  Also, if you need
more than about 14 bits phase, then you have to start using more than one BRAM
per bit.  It gets expensive in terms of BRAMs.  That also does not address the
registers you'll need on the BRAM outputs in order to get the performance out
of them.  I frequently use CORDIC rotators with as many as 30 iterations.
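
For readers who have not met it, the CORDIC rotation referred to above can
be sketched as follows. This is a floating-point behavioural model only,
not the fixed-point shift-and-add hardware: the function name, the float
arithmetic and the default iteration count are assumptions made for
illustration, and this basic form only converges for angles within roughly
+/-99 degrees.

```python
import math

def cordic_rotate(x, y, angle_rad, iterations=14):
    """Rotate (x, y) by angle_rad with CORDIC micro-rotations.

    Each step rotates by +/- atan(2^-i); in hardware the multiply by
    2^-i is a wire shift, so a step costs only adds and subtracts.
    """
    # Elementary angles; in hardware this is a small constant table
    angles = [math.atan(2.0 ** -i) for i in range(iterations)]

    # Every micro-rotation stretches the vector; accumulate that gain
    gain = 1.0
    for i in range(iterations):
        gain *= math.sqrt(1.0 + 2.0 ** (-2 * i))

    z = angle_rad                      # angle still left to rotate
    for i in range(iterations):
        d = 1.0 if z >= 0 else -1.0    # steer toward zero residual
        x, y = x - d * y * 2.0 ** -i, y + d * x * 2.0 ** -i
        z -= d * angles[i]

    return x / gain, y / gain          # undo the accumulated gain

c14, s14 = cordic_rotate(1.0, 0.0, 0.5, iterations=14)
c30, s30 = cordic_rotate(1.0, 0.0, 0.5, iterations=30)
print(abs(c14 - math.cos(0.5)), abs(c30 - math.cos(0.5)))
```

The residual angle after n iterations is bounded by atan(2^-(n-1)), which
is the sense in which the achievable rotation angles are limited by the
iteration count: each extra iteration buys about one more bit.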

Kevin Neilson wrote:

> I've been reading about CORDIC engines (in a paper by Andraka) and I was
> wondering if they are still relevant in parts with a lot of blockRAM.  I've
> always used sin, cos, and arctan lookup tables in blockRAM, which yield two
> results per blockRAM per cycle in a Xilinx V2 (each 18 bits with 88
> millidegree phase resolution), and I was wondering if there is a compelling
> advantage of a CLB-based CORDIC engine in, for example, a V2 part.  I know
> the CORDIC can multiply as well, but the V2 parts also have embedded
> multipliers.  Perhaps those big ROMs and multipliers are making me lazy.
> -Kevin
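
The table sizes quoted in this exchange can be reproduced with a little
arithmetic. The sketch below assumes the lookup table stores a single
quadrant and uses symmetry for the rest, which is an inference: it is what
makes a 1K-deep, 18-bit-wide table come out at 88 millidegrees and a
14-bit address come out at 16 bits of phase. The function names are
illustrative.

```python
def phase_resolution_deg(address_bits, quadrant_symmetry=True):
    """Phase step of a sine table with 2^address_bits entries."""
    span_deg = 90.0 if quadrant_symmetry else 360.0
    return span_deg / (1 << address_bits)

def phase_bits(address_bits, quadrant_symmetry=True):
    """Effective phase word width; symmetry adds two quadrant bits."""
    return address_bits + (2 if quadrant_symmetry else 0)

# 1K x 18 block RAM: 10 address bits
print(phase_resolution_deg(10) * 1000, "millidegrees")
# 16K x 1 block RAM: 14 address bits
print(phase_bits(14), "bits of phase")
```

Each extra bit of phase doubles the table depth, which is the exponential
growth that eventually favours CORDIC.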

--
--Ray Andraka, P.E.
President, the Andraka Consulting Group, Inc.
401/884-7930     Fax 401/884-7950
email ray@andraka.com
http://www.andraka.com

 "They that give up essential liberty to obtain a little
  temporary safety deserve neither liberty nor safety."
                                          -Benjamin Franklin, 1759

Article: 67438
Subject: Re: Release asynchrounous resets synchronously
From: Ray Andraka <ray@andraka.com>
Date: Thu, 11 Mar 2004 20:01:22 -0500

It is just another tool in the box.  Yes, there are plenty of places that would get you into
heaps of trouble.  The environment is often totally within the FPGA, in which case you have
some manner of control over the environment.  It is up to the designer to make sure that his
design is robust enough for the environment it is designed to work in.
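
The five-LUT detector debated in the quoted thread below is easy to check
exhaustively in software. The sketch models each 4-input LUT as an
"exactly one input high" function, as described in the quoted posts, and
then models the two-outputs-per-group variant suggested as the fix. The
function names are made up for this illustration, and nothing here says
how many LUTs the corrected version would take.

```python
def exactly_one(bits):
    """4-input LUT function f: 1 iff exactly one input is high."""
    return 1 if sum(bits) == 1 else 0

def five_lut_check(state):
    """Two-tier detector: four first-tier LUTs feed a fifth."""
    bits = [(state >> i) & 1 for i in range(16)]
    tier1 = [exactly_one(bits[i:i + 4]) for i in range(0, 16, 4)]
    return exactly_one(tier1)

def is_one_hot(state):
    """Reference answer: exactly one of the 16 bits is set."""
    return state != 0 and state & (state - 1) == 0

def group_status(bits):
    """Two outputs per group: (any input high, more than one high)."""
    n = sum(bits)
    return (1 if n >= 1 else 0, 1 if n > 1 else 0)

def two_output_check(state):
    """Legal iff one group has a high bit and no group has several."""
    bits = [(state >> i) & 1 for i in range(16)]
    status = [group_status(bits[i:i + 4]) for i in range(0, 16, 4)]
    groups_with_any = sum(any_high for any_high, _ in status)
    any_group_many = any(many for _, many in status)
    return 1 if groups_with_any == 1 and not any_group_many else 0

# Illegal states that the five-LUT scheme wrongly reports as legal
false_accepts = [s for s in range(1 << 16)
                 if five_lut_check(s) and not is_one_hot(s)]
print(len(false_accepts))
print(five_lut_check(0b1111_1110_1100_0001))   # the example from the thread
```

The single-output scheme fails because a first-tier LUT reports 0 both for
"no inputs high" and for "two or more high", so the fifth LUT cannot tell
those cases apart.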

rickman wrote:

> Except that if you are in an invalid state, you have no assurance that
> the terminal state will ever be reached.  FSMs are not counters.  They
> interact with the environment and illegal states can get them
> deadlocked.
>
> Ray Andraka wrote:
> >
> > You can do that by having the terminal state assert a synchronous reset to the entire
> > state machine with considerably less logic.
> >
> > Jim Granville wrote:
> >
> > > Ray Andraka wrote:
> > >
> > > > Peter,
> > > >
> > > > Consider the case where you have 1 input high to one of your luts, and 2 inputs
> > > > high to another lut.  It is clearly an illegal state, as it has two extra '1' bits,
> > > > but it will be detected as OK by your circuit because exactly one LUT is indicating
> > > > one input on.  Each group of four requires two outputs to distinguish 0,1 or more
> > > > than 1 input on.
> > >
> > >   I think Peter was partly correct. You can protect/correct a 16 stage
> > > One-Hot engine against illegal states with 5 LUTs, but it will not
> > > recover in a single clock cycle.
> > >   Simplest topology is to have 15 shifters, and #16 loads a HI ONLY if
> > > all Prev15 are 000000000000000, if not, it simply waits until
> > > the bogus ones ripple out.
> > >
> > >   -jg
> > >
> > > >
> > > > Peter Alfke wrote:
> > > >
> > > >
> > > > >>Eh, what? Unfortunately anonymous...
> > > >>
> > > >>Each first level LUTs detects (output High) that exactly one of its inputs
> > > >>is High.
> > > >>The second tier LUT detects that exactly one of the first-tier LUT outputs
> > > > >>is high, which means that there is exactly one High input.
> > > >>Agreed ?
> > > >>Peter Alfke
> > > >>
> > > >>
> > > >>>From: user@domain.invalid
> > > >>>Newsgroups: comp.arch.fpga
> > > >>>Date: Tue, 09 Mar 2004 05:24:26 GMT
> > > >>>Subject: Re: Release asynchrounous resets synchronously
> > > >>>
> > > >>>Peter Alfke wrote:
> > > >>>
> > > >>>>LUTs are very efficient "illegal state" detectors.
> > > >>>>Let's say you have a 16-state one-hot machine. Four LUTs can each detect
> > > >>>>"exactly one of my inputs is High", and a fifth LUT does the same with the
> > > >>>>four LUT outputs. So 5 LUTs can detect any illegitimate 16-bit code. Take it
> > > >>>>from there...
> > > >>>
> > > >>>Eh, what?  So the first tier LUT compute f, where
> > > >>>f(a,b,c,d) = 1 iff a+b+c+d = 1, else 0.
> > > >>>For the 5th LUT we have the same property that a legal 16-state would
> > > >>>map exactly one of the four first tier LUTs to 1, thus it sounds like
> > > >>>what you have in mind is something like this:
> > > >>>
> > > >>>f({f(s[3:0]), f(s[7:4]), f(s[11:8]), f(s[15:12])})
> > > >>>
> > > >>>but this could accept states like 16'b1111_1110_1100_0001.
> > > >>>
> > > >>>I don't see how you can detect legal states with only five four-input LUTs.
> > > >>>
> > > >>>
> > > > >>>Peter, the FPGA reset question has come up many times.  What does Xilinx
> > > >>>recommend in general?  Async-reset+Sync-release, all-sync, or all-async?
> > > >>>Which uses fewest resources?
> > > >>>
> > > >>>Thanks,
> > > >>>
> > > >>>Tommy
> > > >>>
> > > >
> > > >
> > > > --
> > > > --Ray Andraka, P.E.
> > > > President, the Andraka Consulting Group, Inc.
> > > > 401/884-7930     Fax 401/884-7950
> > > > email ray@andraka.com
> > > > http://www.andraka.com
> > > >
> > > >  "They that give up essential liberty to obtain a little
> > > >   temporary safety deserve neither liberty nor safety."
> > > >                                           -Benjamin Franklin, 1759
> > > >
> > > >
> >
> > --
> > --Ray Andraka, P.E.
> > President, the Andraka Consulting Group, Inc.
> > 401/884-7930     Fax 401/884-7950
> > email ray@andraka.com
> > http://www.andraka.com
> >
> >  "They that give up essential liberty to obtain a little
> >   temporary safety deserve neither liberty nor safety."
> >                                           -Benjamin Franklin, 1759
>
> --
>
> Rick "rickman" Collins
>
> rick.collins@XYarius.com
> Ignore the reply address. To email me use the above address with the XY
> removed.
>
> Arius - A Signal Processing Solutions Company
> Specializing in DSP and FPGA design      URL http://www.arius.com
> 4 King Ave                               301-682-7772 Voice
> Frederick, MD 21701-3110                 301-682-7666 FAX

--
--Ray Andraka, P.E.
President, the Andraka Consulting Group, Inc.
401/884-7930     Fax 401/884-7950
email ray@andraka.com
http://www.andraka.com

 "They that give up essential liberty to obtain a little
  temporary safety deserve neither liberty nor safety."
                                          -Benjamin Franklin, 1759

Article: 67439
Subject: Re: 300MHz spartan3 cpu update , and Webpack6.2 shocker
From: johnjakson@yahoo.com (john jakson)
Date: 11 Mar 2004 17:08:28 -0800

Hi Nick:

> Actually, thats not hyperthreading/SMT, thats
> interleaved-multithreading or C-slowing.  See Chapter 11 and Appendix
> B:

Call it what you will, thanks for paper anyways. I prefer to use Inmos
terminology for all things related to Transputing, but they never had
context swaps down to 0 cycles, only at 20cycles when hit points
arrived. At the outer level it will be more similar to Transputer ie
<<20cycles to do a real context switch involving link lists. The 0
cycle swapping isn't really any different from the 20cycle, just more
often & faster a natural consequence of commutating the HW. Inmos
always said that a process can be 1 op or many so I could call it 1
cycle Process swapping.

The word thread is not the right word for sure as I prefer to use the
process word but that upsets the big OS word police too. I will leave
the argument for another day.

Right now few folks have ever seen these sorts of cpus. Intel HT is
kind of a sore point to users, people want to turn it off, but what if
you can't, and what happens when HT is up to 16 or so? Anyway the x86
architecture with HT or without makes no sense for a Transputer head.


> 
> http://www.cs.berkeley.edu/~nweaver/nweaver_thesis.pdf
> 
> It is a big win in FPGA-based CPUs if you can't get your forwarding
> path small enough.


Absolutely true. Without it, having to deal with all the hazards would
wreck the performance and up the design effort enormously. A previous
design was single threaded and had the forwarding logic, but had too
many other complexities to deal with in FPGA to be fast or
completable.

johnjakson_usa_com

Article: 67440
Subject: Re: 300MHz spartan3 cpu update , and Webpack6.2 shocker
From: johnjakson@yahoo.com (john jakson)
Date: 11 Mar 2004 17:21:46 -0800

"Tim" <tim@rockylogic.com.nooospam.com> wrote in message news:<c2q7m9$bq3$1$830fa17d@news.demon.co.uk>...
> john jakson wrote:
> >                          Also C  allows me to create complex logic
> > in C for getting something to work 1st, when it does, turn it back
> > into HDL RTL form. Worked well for me for 20yrs.
> 
> Or do it in Pascal/Delphi and you are just a hop and a skip from
> VHDL.  And if _that_ isn't a controversial sentence....

Yes yes precisely. 

Actually in my Inmos days I was an ADA (steelman.... student), even
thought the iAPX432 was a serious effort, but bit length ops was
overdoing it so Inmos went byte encoded like big brother.

Actually VHDL & ADA make a lot of sense, just that I don't like big
languages if I can help it. ADA and VHDL have too many ways of doing
the same thing. Verilog is going the same way. ADA already has what
Occam has (rendezvous etc) but not the conciseness. I haven't really
felt the urge to read VHDL yet, I will watch what Art says.

But seriously C cycle code works pretty well for 1 clock domains where
there is a 1 to 1 for the HDL and C model. Today I just pushed the
model through 1M cycles of rnd adds, a few secs on an xp, but I wonder
what that would be in Verilog. As a Verilog bigot I could imagine VHDL
taking longer than the age of the universe, or has it caught up with
Verilog? (ModelSim should be same I know)
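
As an illustration of what a cycle model with a 1 to 1 mapping to the HDL
looks like, here is a sketch of the registered adder posted earlier in this
thread (article 67432), one method call per clock edge. It is written in
Python rather than C purely for brevity; the class and function names are
hypothetical and this is not the author's model.

```python
import random

class RegisteredAdder:
    """Cycle model of a registered adder: ra, rb and rz all clocked."""

    def __init__(self, width):
        self.mask = (1 << width) - 1
        self.ra = self.rb = self.rz = 0

    def clock(self, a, b):
        """Apply one rising clock edge and return the new output."""
        # Nonblocking-assignment semantics: compute every next value
        # from the current state before updating any register.
        next_rz = (self.ra + self.rb) & self.mask
        next_ra = a & self.mask
        next_rb = b & self.mask
        self.ra, self.rb, self.rz = next_ra, next_rb, next_rz
        return self.rz

def run_random_adds(cycles, width, seed=0):
    """Push random operands through the model and self-check it."""
    rng = random.Random(seed)
    dut = RegisteredAdder(width)
    mask = (1 << width) - 1
    expected = 0                       # sum of the registers at reset
    for _ in range(cycles):
        a, b = rng.randrange(1 << width), rng.randrange(1 << width)
        out = dut.clock(a, b)
        if out != expected:
            return False
        expected = (a + b) & mask      # appears at the output next edge
    return True

print(run_random_adds(100_000, 32))
```

The one-edge delay between presenting operands and seeing their sum is the
same behaviour the nonblocking assignments give in the Verilog.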

Time to get NWirth on board but then it will definitely look like
Pascal as did Lola and so on.

johnjakson_usa_com

Article: 67441
Subject: Re: 300MHz spartan3 cpu update , and Webpack6.2 shocker
From: johnjakson@yahoo.com (john jakson)
Date: 11 Mar 2004 17:27:29 -0800

rickman <spamgoeshere4@yahoo.com> wrote in message 

> > 
> > Post synth.
> 
> OH!!!  My experience is that the synth tools do a lousy job of
> estimating total delay.  You can expect your routed delays to be twice
> that or more.  If you are seeing 400 MHz post synth, you will be lucky
> to get 200 MHz after routing.  If you don't floorplan well, you may see
> <100 MHz.  
> 
> -- 
> 

I will definitely floor plan, I am an old hand at it from VLSI days, but
while the logic is unstable it doesn't make much sense to spend time on
it. But it certainly cuts the route lengths way down and puts all
related LUTs and related FFs together, and the pics even look sorta
nice. I still have to write some auto magic ucf writer code to speed
up the edit sessions.

johnjakson_usa_com

Article: 67442
Subject: Re: Oftenly used hardware algorithm for RC4 encryption?
From: "Kelvin @ SG" <kelvin8157@hotmail.com>
Date: Fri, 12 Mar 2004 09:29:38 +0800

well, more specific...
what is the commonly used implementation method for RC4 encryption? It seems
to me the original 20 lines of C code are not so suitable for hardware
implementation...

Kelvin

Max <mtj2@btopenworld.com> wrote in message
news:ago050dtfa2024tnuneotj0nritbqqddj5@4ax.com...
> On Thu, 11 Mar 2004 16:35:22 +0800, Kelvin @ SG wrote:
>
> >What are the commonly used algorithms for RC4 encryption in an FPGA?
>
> Same as in software. The algorithm for RC4 is published by RSA Labs,
> but how you implement it is up to you.
>
> >is it true that i have to use at least 256X8X2 registers or RAM?
>
> It's a permutation cypher. I can't remember the block size offhand,
> but that sounds about right (particularly with block-chaining).
>
> It's quite a modest RAM requirement, as crypto algorithms go.
>
> --
>   Max

Article: 67443
Subject: Re: Answering Machine RAM
From: Allan Herriman <allan.herriman.hates.spam@ctam.com.au.invalid>
Date: Fri, 12 Mar 2004 12:47:05 +1100
Links: << >> << T >> << A >>

On Thu, 11 Mar 2004 14:44:09 -0000, "Jim" <me@privacy.net> wrote:

>Rather than FLASH, wouldn't the normal approach be to fit battery-backup for
>the real-time clock? It would mean very occasional battery replacement but
>it would probably last 5 years or so, by which time they'd likely call the
>product obsolete ;)

The "normal" approach is not to have an RTC chip at all; they cost
money.
The clock is probably implemented in software on the MCU.  Battery
backup for an MCU is a little more complicated, unless it has been
designed in right from the beginning.

Regards,
Allan.

Article: 67444
Subject: Three multipliers for FUNC_MULTI(A, B) in a 3 branch case statement?
From: "Kelvin @ SG" <kelvin8157@hotmail.com>
Date: Fri, 12 Mar 2004 10:16:05 +0800
Links: << >> << T >> << A >>

Hi, there:

FUNC_MULTI(A, B) is a handcrafted multiplier function, written to avoid
Xilinx's components. It is then used in a 3-branch case statement.

It seems ISE builds three copies of FUNC_MULTI instead of muxing the
inputs into a single one.

Is this true?
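
The two structures being compared can be modelled as below. This is a
Python sketch with made-up names, not the actual FUNC_MULTI source: the
shift-and-add body and the 8-bit unsigned operands are assumptions. The
first form has one multiplier per case branch with a mux on the results;
the second muxes the operands first and feeds a single multiplier. Both
give the same answer, so writing the second form explicitly in the HDL is
one way to get a single multiplier without relying on the tool to share
it.

```python
def func_multi(a, b, width=8):
    """Shift-and-add multiply of two unsigned values (b is width bits)."""
    product = 0
    for bit in range(width):
        if (b >> bit) & 1:
            product += a << bit
    return product


def three_multipliers(sel, a0, b0, a1, b1, a2, b2):
    """One call site per case branch; in hardware each call site is a
    separate multiplier, followed by a mux on the three products."""
    if sel == 0:
        return func_multi(a0, b0)
    elif sel == 1:
        return func_multi(a1, b1)
    return func_multi(a2, b2)


def shared_multiplier(sel, a0, b0, a1, b1, a2, b2):
    """Muxes on the operands, then a single call site (one multiplier)."""
    index = min(sel, 2)  # any sel other than 0 or 1 takes the last branch
    a = (a0, a1, a2)[index]
    b = (b0, b1, b2)[index]
    return func_multi(a, b)
```

The cost of the shared form is two 3-to-1 operand muxes in front of the
multiplier instead of one 3-to-1 mux behind it.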


Kelvin

Article: 67445
Subject: Re: Oftenly used hardware algorithm for RC4 encryption?
From: Max <mtj2@btopenworld.com>
Date: Fri, 12 Mar 2004 02:21:53 +0000 (UTC)
Links: << >> << T >> << A >>

On Fri, 12 Mar 2004 09:29:38 +0800, Kelvin @ SG wrote:

>Well, to be more specific:
>what is the commonly used implementation method for RC4 encryption? It
>seems to me that the original 20 lines of C code are not so suitable for
>hardware implementation...

RSA will supply you a detailed description of the algorithm when you
buy the licence to use RC4, together with implementation guidelines
and various other documents.

-- 
  Max

Article: 67446
Subject: Re: Quartus II 3.0 sp1 web, verilog input, memories optimized away ?
From: "Subroto Datta" <sdatta@altera.com>
Date: Fri, 12 Mar 2004 02:23:38 GMT
Links: << >> << T >> << A >>

Hi Raymund,

This design compiles fine for Cyclone with both 3.0 SP1 and 4.0. All 5
RAMs are inferred with depth = 512 and data width = 8, which is what is
described in the Verilog design file. From your description it is unclear
what the problem is.

- Subroto Datta
Altera Corp.


"raymund hofmann" <filter001@desinformation.de> wrote in message
news:c2ppu5$bm$1@online.de...
> I am trying this:
>
> module uv_v_filter
>   (
>     input       clk,
>     input       active,
>     input[7:0]  y,
>     input[7:0]  u,
>     input[7:0]  v,
>
>     output      activeout,
>     output[7:0] yout,
>     output[7:0] uout,
>     output[7:0] vout
>   );
>
>   parameter log2samplesline=9;
>   reg[log2samplesline-1:0]  samplecounter;
>   reg[7:0]  yout, uout, vout;
>
>   integer i;
>
>   // line memories
>
>   reg       activeout;
>
>   reg[7:0]  yshift0[0:(1<<log2samplesline)-1];
>
>   reg[7:0]  ushift0[0:(1<<log2samplesline)-1];
>   reg[7:0]  ushift1[0:(1<<log2samplesline)-1];
>
>   reg[7:0]  vshift0[0:(1<<log2samplesline)-1];
>   reg[7:0]  vshift1[0:(1<<log2samplesline)-1];
>
>   always @(posedge clk)
>    begin
>     yout<=yshift0[samplecounter];
>     uout<=(   u                       * 1
>             + ushift0[samplecounter]  * 2
>             + ushift1[samplecounter]  * 1
>             + 2
>          )>>2;
>     vout<=(   v                       * 1
>             + vshift0[samplecounter]  * 2
>             + vshift1[samplecounter]  * 1
>             + 2
>          )>>2;
>     // shift registers / line memories
>     if (active)
>      begin
>       ushift0[samplecounter]<=ushift1[samplecounter];
>       vshift0[samplecounter]<=vshift1[samplecounter];
>
>       yshift0[samplecounter]<=y;
>       ushift1[samplecounter]<=u;
>       vshift1[samplecounter]<=v;
>
>       samplecounter<=samplecounter+1;
>      end
>     else
>      begin
>       samplecounter<=0;
>      end
>
>     activeout<=active;
>
>    end
>
> endmodule
>
> And Quartus seems to optimize the memories away to a strange total of 22
> bits, targeting a Cyclone.
> The funny thing is:
> vshift1 gets 20 bits.
> yshift0 gets 2 bits.
>
> Is Quartus or me the problem?
>
> Raymund Hofmann
>
>
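
As a cross-check on the numbers, the quoted module can be modelled
behaviourally. The Python sketch below is an illustration only (class and
method names are made up); it follows the quoted Verilog line by line,
with one call to clock() per clock edge, and starts the memories at zero
where the Verilog would start as unknown. With log2samplesline = 9 the
five line memories come to 5 x 512 x 8 = 20480 bits, which is the figure
to compare against the compilation report.

```python
class UvVFilter:
    """Behavioural model of the quoted uv_v_filter module."""

    def __init__(self, log2samplesline=9):
        self.depth = 1 << log2samplesline
        # Five line memories, each depth x 8 bits (zero-initialised here).
        self.yshift0 = [0] * self.depth
        self.ushift0 = [0] * self.depth
        self.ushift1 = [0] * self.depth
        self.vshift0 = [0] * self.depth
        self.vshift1 = [0] * self.depth
        self.samplecounter = 0

    def memory_bits(self):
        """Total storage described by the five memories."""
        return 5 * self.depth * 8

    def clock(self, active, y, u, v):
        """Apply one clock edge; return the new (yout, uout, vout)."""
        sc = self.samplecounter
        # All reads see the values held before the edge, as with the
        # nonblocking assignments in the Verilog.
        yout = self.yshift0[sc]
        uout = ((u + 2 * self.ushift0[sc] + self.ushift1[sc] + 2) >> 2) & 0xFF
        vout = ((v + 2 * self.vshift0[sc] + self.vshift1[sc] + 2) >> 2) & 0xFF
        if active:
            self.ushift0[sc] = self.ushift1[sc]
            self.vshift0[sc] = self.vshift1[sc]
            self.yshift0[sc] = y
            self.ushift1[sc] = u
            self.vshift1[sc] = v
            self.samplecounter = (sc + 1) % self.depth
        else:
            self.samplecounter = 0
        return yout, uout, vout
```

Every memory is both written and read on the path to an output, so a
model like this gives no reason to expect any of the five to be removed.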

Article: 67447
Subject: Re: Answering Machine RAM
From: Ray Andraka <ray@andraka.com>
Date: Thu, 11 Mar 2004 21:47:30 -0500
Links: << >> << T >> << A >>

Just plug the darned thing into your computer's UPS and be done with it!

Allan Herriman wrote:

> On Thu, 11 Mar 2004 14:44:09 -0000, "Jim" <me@privacy.net> wrote:
>
> >Rather than FLASH, wouldn't the normal approach be to fit battery-backup for
> >the real-time clock? It would mean very occasional battery replacement but
> >it would probably last 5 years or so, by which time they'd likely call the
> >product obsolete ;)
>
> The "normal" approach is not to have an RTC chip at all; they cost
> money.
> The clock is probably implemented in software on the MCU.  Battery
> backup for an MCU is a little more complicated, unless it has been
> designed in right from the beginning.
>
> Regards,
> Allan.

--
--Ray Andraka, P.E.
President, the Andraka Consulting Group, Inc.
401/884-7930     Fax 401/884-7950
email ray@andraka.com
http://www.andraka.com

 "They that give up essential liberty to obtain a little
  temporary safety deserve neither liberty nor safety."
                                          -Benjamin Franklin, 1759

Article: 67448
Subject: Does XST handles //synopsys parallel_case?
From: "Kelvin @ SG" <kelvin8157@hotmail.com>
Date: Fri, 12 Mar 2004 11:01:36 +0800
Links: << >> << T >> << A >>

Hi, there:

I am wondering whether XST handles // synopsys parallel_case?

Also, how do I know whether XST has taken in the // synthesis
parallel_case directive? I didn't see any such information in the
synthesis transcripts with either //synopsys or //synthesis...
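
For anyone wondering why it matters whether the directive is honored, the
Python sketch below models the two circuits a 3-item "case (1'b1)" on a
select vector can turn into. The names and the default value of 0 are
assumptions for the example, and it says nothing about what XST actually
does. Without the directive the first matching item wins (a priority
chain); with it, the tool may assume the items never overlap and build a
plain AND-OR mux. The two agree for one-hot selects and differ otherwise.

```python
def priority_case(sel, d):
    """Case items tested in order: the lowest set select bit wins."""
    for bit in range(3):
        if (sel >> bit) & 1:
            return d[bit]
    return 0  # assumed default branch


def parallel_mux(sel, d):
    """AND-OR mux built on the assumption that the select is one-hot."""
    out = 0
    for bit in range(3):
        if (sel >> bit) & 1:
            out |= d[bit]
    return out


def first_difference(d):
    """Return the lowest select value where the two forms disagree."""
    for sel in range(8):
        if priority_case(sel, d) != parallel_mux(sel, d):
            return sel
    return None
```

One indirect check, short of a message in the transcript, would be to
synthesize with and without the comment and compare the resulting logic
or resource counts.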

Best Regards,
Kelvin

Article: 67449
Subject: Virtex 2 P -> PPC write to block RAM
From: Matthew E Rosenthal <mer2@andrew.cmu.edu>
Date: Thu, 11 Mar 2004 22:24:10 -0500 (EST)
Links: << >> << T >> << A >>

Hi,
I have been creating a design extensively in hardware, and I would like
to have a PowerPC write to block RAMs that are in my HW design.
Can someone point me in the right direction?

What sort of bus do I need to create?

Any pointers would be much appreciated.

Thanks

Matt
